This Position is Closed
This job is no longer accepting applications.
Senior Software Engineer (MLOps) – Annotation & Evaluation
Posted 1 month ago
Experience Level: Senior Level (5-10 years)
Employment Type: Full-Time
Work Location: Remote
About This Role
At Datadog, we are building an internal AI platform that empowers our teams to train, evaluate, and deploy models at scale. The Annotation & Evaluation team plays a foundational role in ensuring our models are reliable, safe, and production-ready. We design the infrastructure and tooling for dataset labeling, model benchmarking, trust & safety evaluation, and performance diagnostics across a range of ML and LLM applications.
From interactive labeling pipelines to automated evaluation environments, our systems provide the core feedback loop that allows engineers and scientists to measure, compare, and continuously improve AI models. We work at the intersection of applied ML, data engineering, and platform infrastructure.
We’re looking for a Senior Software Engineer to help us scale our evaluation systems, develop benchmarking tools, and drive trust & safety observability across Datadog's AI product offerings.
At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.
What You’ll Do
Design and build systems that support automated evaluation of AI models, including LLMs and agents, using production-like telemetry and scenarios.
Lead efforts to develop benchmark suites, evaluation pipelines, and model comparison tooling with built-in trust & safety metrics.
Build and maintain integrations with labeling systems (e.g., Label Studio) and coordinate with external/internal annotation workflows.
Collaborate closely with Applied AI and Bits AI teams to enable fast iteration, reproducible experimentation, and interpretable evaluations.
Develop data pipelines that feed metrics, results, and alerts into our observability stack to track model behavior at scale.
Drive best practices around safe model deployment by building systems for bias checks, hallucination detection, and human-in-the-loop review.
Who You Are
You have 6+ years of software engineering experience, including 2+ years working with ML/AI products or platforms.
You have experience building backend or data infrastructure systems, especially for model evaluation, benchmarking, or labeling workflows.
You’re comfortable working across the stack: from orchestrating large-scale data processing pipelines to building UI integrations for labeling or eval dashboards.
You have a solid understanding of ML concepts, including model performance metrics, prompt evaluation, and trust & safety concerns.
You’re fluent in Python or Go and familiar with cloud-native development patterns.
Bonus points: experience with open-source evaluation frameworks (e.g., lm-eval-harness), vector DBs, or human feedback loops.
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about technology and want to grow your skills, we encourage you to apply.
Benefits and Growth
New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
Continuous professional development, product training, and career pathing
Intradepartmental mentor and buddy program for in-house networking
An inclusive company culture, ability to join our Community Guilds (Datadog employee resource groups)
Access to Inclusion Talks, our internal panel discussions
Free, global mental health benefits for employees and dependents age 6+
Competitive global benefits
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.
About Datadog
Datadog (NASDAQ: DDOG) is a global SaaS business, delivering a rare combination of growth and profitability. We are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of our customers’ entire technology stacks. Built by engineers, for engineers, Datadog is used by organizations of all sizes across a wide range of industries. Together, we champion professional development, diversity of thought, innovation, and work excellence to empower continuous growth. Join the pack and become part of a collaborative, pragmatic, and thoughtful people-first community where we solve tough problems, take smart risks, and celebrate one another. Learn more about #DatadogLife on LinkedIn and Datadog Learning Center.
Equal Opportunity at Datadog
Datadog is proud to offer Equal Employment Opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and other characteristics protected by law. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our Candidate Legal Notices for your reference.
Datadog endeavors to make our Careers Page accessible to all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please complete this form. This form is for accommodation requests only and cannot be used to inquire about the status of applications.
Privacy and AI Guidelines
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice. For information on our AI policy, please visit Interviewing at Datadog AI Guidelines.
