Position Overview
Work Type: Full-Time | Onsite
Role Overview
We are seeking a highly skilled Site Reliability Engineer (SRE) / Cloud Operations Engineer to support and enhance our global infrastructure. This role focuses on ensuring the reliability, scalability, and security of our cloud environments while driving automation and operational excellence across the organization. You will work closely with cross-functional teams to build resilient systems and continuously improve platform performance.
Responsibilities
Optimize and maintain high-availability Google Cloud Platform (GCP) production environments and Linux-based middleware across global operations
Lead Site Reliability Engineering (SRE) initiatives, including designing observability frameworks, implementing automated monitoring, and developing self-healing systems
Drive automation through Infrastructure as Code (IaC) using tools such as Terraform and Ansible, and develop internal tools to reduce operational overhead
Oversee cloud security ...