Senior Data Architect

Company

Omilia

Location

Remote, Romblon

Posted

June 02, 2026

Position Overview

Accountabilities

Own the Training Environment data architecture end-to-end: dataset design and schema for all ML training pipelines, including dialog corpora for LLM training, conversational steps for NLU models, annotated evaluation sets, and whole-call recordings for speech-to-speech model development.
Define and govern data selection and sampling strategy: establish criteria that determine which production conversations have the highest training value, including diversity‑optimized sampling, confidence‑based filtering, edge‑case prioritization, and deduplication strategies. 
Build and maintain the data catalog and dataset discovery infrastructure: enable ML engineers across LLM, NLU, Speech, and Agentic teams to find, understand, and use training data without friction. 
Define annotation pipeline architecture: establish requirements for data labeling—intent annotation, entity tagging, dialog act classification, task completion scorin...
        
