Information Technology 🏢 Full Time ⭐️ Verified

Senior Site Reliability Engineer (SRE)

NexusCloud Systems
San Francisco
Estimated Salary
USD 175,000 – USD 220,000
Live Update
2 June 2026
Deadline
2 June 2027

Job Description

Are you obsessed with system performance, scalability, and uptime? NexusCloud Systems is looking for a Senior Site Reliability Engineer to help us build and maintain high-traffic cloud infrastructure. In this role, you will bridge the gap between development and operations, ensuring our platform remains resilient in a fast-paced environment.

You will play a pivotal role in shaping our SRE culture, automating manual toil, and optimizing our AWS-based microservices architecture. If you thrive on solving complex distributed systems problems and championing reliability, we want to hear from you.

Responsibilities

  • Design, build, and maintain highly available, scalable, and secure cloud infrastructure on AWS.
  • Automate operational tasks through infrastructure-as-code (Terraform, Ansible) to reduce manual toil.
  • Implement proactive monitoring, logging, and alerting strategies using Datadog and Prometheus.
  • Lead incident response, root cause analysis, and post-mortem reviews to improve system reliability.
  • Collaborate with cross-functional software engineering teams to optimize application performance.
  • Develop and manage CI/CD pipelines to ensure seamless and reliable code deployments.
  • Establish and maintain Service Level Objectives (SLOs) and Error Budgets for core services.

Qualifications

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering.
  • Expert-level proficiency with the AWS ecosystem (EC2, EKS, RDS, S3, IAM).
  • Strong coding skills in Python, Go, or Ruby for automation and tool development.
  • Hands-on experience with Kubernetes orchestration and containerization (Docker).
  • Advanced knowledge of Linux systems administration, networking, and security best practices.
  • Proven ability to troubleshoot complex performance issues in distributed systems.
  • Strong communication skills and a collaborative mindset for cross-team initiatives.

Required Skills

AWS, Kubernetes, Terraform, Python, Go, CI/CD, Prometheus, Datadog, Distributed Systems

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
