America's Job Portal
A successful Site Reliability Engineer will have:
Experience
• At least 3 years of hands-on experience running AWS production systems at
scale
• Proven expertise with AWS EKS (Elastic Kubernetes Service) or similar and MSK
(Managed Streaming for Kafka) in production environments, as well as database
performance diagnostics (MySQL, Postgres, MongoDB) on multi-TB databases
• Strong background in Infrastructure as Code, preferably with Pulumi using
TypeScript or equivalent Terraform experience
• Demonstrated experience participating in incident management (ideally as an
incident commander with a track record of leading post-mortem processes)
• Experience with high-volume data processing systems, ideally IoT telemetry or
streaming pipelines processing ≥50k messages per second
• Background in implementing and maintaining observability solutions using
Prometh...