Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

NexusCloud Systems
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
2 June 2026
Deadline
2 Jun 2027

Job Description

Are you obsessed with uptime, scalability, and system performance? NexusCloud is seeking a high-impact Senior Site Reliability Engineer to join our core infrastructure team. In this role, you will bridge the gap between software development and IT operations, ensuring our global cloud architecture remains resilient and highly performant under heavy load.

You will work alongside elite engineers to automate provisioning, optimize latency, and drive our incident management strategy. If you thrive in a fast-paced environment and love solving complex distributed systems puzzles, we want to hear from you.

Responsibilities

  • Design and implement robust monitoring, alerting, and logging systems to ensure 99.99% service availability.
  • Lead the automation of infrastructure provisioning and configuration management using Terraform and Ansible.
  • Conduct deep-dive post-mortems and root cause analysis for production incidents to prevent recurrence.
  • Develop and maintain CI/CD pipelines to streamline deployment velocity and reliability.
  • Optimize cloud resource utilization to balance performance needs with cost-efficiency.
  • Collaborate with cross-functional product teams to define and meet rigorous Service Level Objectives (SLOs).
  • Mentor junior team members on SRE best practices and operational excellence.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or large-scale systems engineering.
  • Expert-level proficiency in public cloud environments (AWS, GCP, or Azure).
  • Strong hands-on experience with Kubernetes, Docker, and container orchestration at scale.
  • Advanced scripting skills in Python, Go, or Ruby for automation and tool development.
  • Deep understanding of networking protocols, load balancing, and distributed systems architecture.
  • Proven ability to thrive in an on-call rotation and handle incident response effectively.

Required Skills

Kubernetes, AWS, Terraform, Python, Go, CI/CD, Distributed Systems, Observability, Linux

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.