GPU Infrastructure Support Engineer

Company

CloudEngine Digital

Location

Kuala Lumpur, Kuala Lumpur

Posted

May 31, 2026

Position Overview

We are seeking an Infrastructure Support Engineer to join the Global Infrastructure team. This role focuses on GPU system delivery, incident detection, triage, basic remediation, runbook execution, monitoring, and clear escalation to the Site Reliability Engineering (SRE) team, while helping improve operational runbooks and observability.

Responsibilities

Provide first- and second-line technical support to customers for the AI Infrastructure (GPU/CPU nodes, networking, storage, orchestration, platform services) via ticketing systems, email, Slack, or other messaging systems.
Monitor system health and service-level indicators (alerts, dashboards); respond to alerts 24x7 as scheduled. 
Triage incidents, gather context, verify scope and impact, and follow standard operating procedures and runbooks to perform immediate mitigations.
Escalate to the global SRE team with clear, concise incident notes and relevant logs/traces.
Maint...
        
