Design and implement self-contained evaluation tasks, including prompts, supporting files, and detailed grading rubrics.
Define clear, unambiguous written criteria to assess AI performance across diverse administrative and workflow scenarios.
Meticulously observe and document AI agent behaviors, producing precise summaries and reports in high-quality English.
Iterate and refine evaluation tasks and rubrics based on feedback to ensure robust benchmarking methodologies.
Collaborate cross-functionally to adapt evaluation frameworks as project requirements evolve.
Requirements
Minimum 3 years of experience in roles emphasizing written precision and structured thinking (e.g., paralegal, technical writer, QA analyst, or research assistant).
Native or fluent English writing skills with the ability to produce succinct and unambiguous observations.
Proven skill in designin...