Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

NexusCloud Systems
San Francisco
Estimated Salary
USD 175,000 – USD 225,000
Live Update
June 2, 2026
Deadline
June 2, 2027

Job Description

Elevate the Future of Cloud Infrastructure

NexusCloud Systems is seeking a visionary Senior Site Reliability Engineer to join our high-impact SRE team in San Francisco. You will be at the forefront of designing, building, and scaling our global cloud infrastructure, ensuring 99.999% availability for our mission-critical enterprise platforms. We are looking for an expert in distributed systems who thrives on automating away manual toil and building resilience into complex architectures.

Responsibilities

  • Design and maintain highly available, scalable, and resilient distributed systems on AWS and GCP.
  • Automate infrastructure provisioning and configuration management using Terraform and Ansible.
  • Champion observability practices by implementing advanced monitoring, logging, and tracing solutions (Datadog, Prometheus).
  • Lead incident response, perform blameless post-mortems, and drive architectural improvements to prevent recurrence.
  • Collaborate with development teams to integrate CI/CD pipelines and ensure seamless deployment cycles.
  • Develop and maintain Kubernetes clusters at scale to support a microservices architecture.
  • Define and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system stability.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • 5+ years of experience in SRE, DevOps, or Systems Engineering roles.
  • Deep expertise in Kubernetes, Docker, and container orchestration platforms.
  • Proficiency in programming with Go, Python, or Ruby for infrastructure automation.
  • Hands-on experience with IaC tools such as Terraform or CloudFormation.
  • Strong background in Linux internals, networking (TCP/IP, DNS, Load Balancing), and security best practices.
  • Proven ability to troubleshoot complex issues across the entire stack in high-pressure environments.

Required Skills

Kubernetes, AWS, GCP, Terraform, Go, Python, Observability, CI/CD, Distributed Systems, Linux

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
