Position Overview
AWS Annapurna Labs is seeking a Senior Manager of Quality & Reliability Engineering to lead the QnR function within the Trainium Manufacturing, Quality and Reliability organization. You will own quality and reliability outcomes for all Trainium AI server products, from component qualification through fleet performance, leading an engineering team across multiple concurrent chip and system generations. This role defines reliability strategy for liquid-cooled and air-cooled platforms at rapidly scaling volumes, builds quality systems across a multi-supplier global manufacturing base, drives fleet failure investigations to root cause, and establishes the reliability characterization capabilities required for next-generation technologies.
Key job responsibilities
- Lead and grow a QnR engineering team, hiring, developing, and retaining top reliability and quality engineering talent.
- Set technical direction for component qualification, reliability testing (HALT, HTOL, ...