Job Description
Are you obsessed with system stability, scalability, and performance? Nexus Cloud Infrastructure is seeking a Senior Site Reliability Engineer (SRE) to join our elite engineering team. You will be instrumental in designing, building, and maintaining the highly available cloud environments that power our global enterprise clients.
We value engineers who automate the mundane, treat infrastructure as code, and thrive in high-stakes production environments.
Responsibilities
- Design and maintain highly scalable, distributed cloud infrastructure on AWS/GCP.
- Drive capacity planning, performance analysis, and system tuning to meet a 99.999% availability target.
- Implement CI/CD pipelines to automate deployment workflows and reduce manual overhead.
- Respond to production incidents, conduct blameless post-mortems, and identify systemic improvements.
- Collaborate with development teams to integrate observability, logging, and monitoring best practices.
- Mentor junior engineers and advocate for SRE best practices across the organization.
- Develop custom automation tools using Python, Go, or Bash to solve complex operational challenges.
Qualifications
- 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
- Deep expertise in cloud-native infrastructure management (AWS, GCP, or Azure).
- Proven experience with Kubernetes orchestration and containerization (Docker).
- Proficiency in Infrastructure as Code (IaC) tools such as Terraform or Pulumi.
- Strong background in programming/scripting (Go, Python, or Ruby).
- Experience with observability stacks like Prometheus, Grafana, Datadog, or ELK.
- Solid understanding of Linux internals, networking protocols, and security principles.
- Excellent problem-solving skills and the ability to work effectively as part of an on-call rotation.