Support the design, execution, and scaling of evaluation and annotation programs for agentic AI systems, with a focus on defining metrics, schemas, rubrics, and quality frameworks for multi-step reasoning, tool use, task completion, policy adherence, and safe agent behavior.
Job Description
As a Senior Associate in Consumer & Community Banking, you will support the development and operationalization of evaluation frameworks for agentic AI systems. This role focuses on how AI agents plan, reason, use tools, follow policies, recover from errors, and complete tasks across multi-turn, multi-step workflows. You will partner with data science, machine learning engineering, product, architecture, technology, and linguistics teams to define what “good” agent behavior looks like and translate that definition into measurable evaluation criteria. You will design annotation schemas, create rubrics for agent trajectories, train annotators, lead calibration exercises, maintain gold and challenge d...