Senior Data Architect

Company

Omilia

Location

Remote, Romblon

Posted

June 02, 2026

Position Overview

Accountabilities

  • Own the Training Environment data architecture end-to-end: dataset design and schema for all ML training pipelines, including dialog corpora for LLM training, conversational steps for NLU models, annotated evaluation sets, and whole-call recordings for speech-to-speech model development.
  • Define and govern data selection and sampling strategy: establish criteria that determine which production conversations have the highest training value, including diversity‑optimized sampling, confidence‑based filtering, edge‑case prioritization, and deduplication strategies.
  • Build and maintain the data catalog and dataset discovery infrastructure: enable ML engineers across LLM, NLU, Speech, and Agentic teams to find, understand, and use training data without friction.
  • Define annotation pipeline architecture: establish requirements for data labeling—intent annotation, entity tagging, dialog act classification, task completion scorin...
