Position Overview
- Design and develop scalable, reliable enterprise architectures
- Build automation tools to improve system reliability and reduce manual effort
- Drive system performance, latency, and availability improvements
- Participate in capacity planning, demand forecasting, and system tuning
- Troubleshoot large-scale production issues and ensure rapid resolution
- Define and track SRE metrics such as SLOs, SLIs, error budgets, and latency
- Implement monitoring, alerting, and observability frameworks
- Collaborate with infrastructure and business teams for operational excellence
- Lead and mentor engineering teams and foster a high-performance culture
- Ensure adherence to engineering best practices and continuous improvement