Lead / Senior QA Engineer – with Temporal and LLM, Langfuse

Remote, USA
Posted Jun 12, 2026
Full-time

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Vertical Falls LLC, is seeking the following. Apply via Dice today!

Job Title: Lead / Senior QA Engineer – Agentic AI Systems with Langfuse, Temporal
100% Remote
Interview Mode: 2 Video
6-12 Months Contract

We are looking for a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function.

This role goes beyond traditional QA, focusing on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products. You will play a critical role in shaping how quality is defined, measured, and improved for agentic systems that operate with minimal human intervention.

Key Responsibilities

Agentic QA Strategy & Scaling
• Design and scale an agentic QA model for autonomous AI systems
• Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
• Establish best practices for testing AI agents across lifecycle stages

Product Quality Ownership
• Own QA for 3 core AI products:
  • AI Contact Center solutions
  • AI Chat & Form-based interaction systems
  • AI Assistants (autonomous / semi-autonomous agents)
• Define quality benchmarks, SLAs, and success metrics for each product
• Proactively identify quality gaps ahead of customer impact

Metrics, Observability & Evaluation
• Define and track performance outputs for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
• Build frameworks for:
  • Evals & graders (LLM evaluation pipelines)
  • Output scoring and benchmarking
  • Continuous feedback loops
• Leverage tools like Langfuse for:
  • LLM observability and tracing
  • Prompt monitoring and performance analysis
  • Debugging agent behavior in production
• Analyze:
  • Downstream issues
  • Production tickets
  • Failure patterns

Automation & Testing Frameworks
• Build and scale automation across:
  • Regression testing
  • Smoke testing
  • End-to-end agent workflows
• Develop and maintain Playwright-based automation scripts
• Integrate QA into CI/CD pipelines for continuous validation

Agentic Testing & Validation
• Design testing approaches for:
  • Multi-step agent workflows
  • Context retention and reasoning
  • Tool usage by agents
• Work with orchestration frameworks like Temporal to:
  • Validate long-running workflows
  • Test retries, state transitions, and failure handling in agent pipelines
• Account for non-deterministic behavior in AI systems
• Invest additional effort in agentic validation, recognizing higher complexity vs traditional QA

Continuous Improvement & Innovation
• Define frameworks to predict and prevent failures before customer exposure
• Continuously improve QA processes using AI and automation
• Partner with Product, Engineering, and AI teams to improve system quality

Required Skills & Experience
• 5–10+ years in QA / Quality Engineering, with strong automation experience
• Hands-on experience with:
  • Test automation tools (Playwright preferred)
  • API and system testing
• Strong understanding of:
  • AI/ML systems (LLMs, conversational AI preferred)
  • Evaluation frameworks and benchmarking
• Experience with:
  • Temporal (workflow orchestration, stateful systems testing)
  • Langfuse (LLM observability, tracing, and evaluation)
• Experience in:
  • Building QA frameworks from scratch
  • Working with production data, logs, and issue triaging

Good to Have
• Experience with LLM eval frameworks, prompt testing, or AI red-teaming
• Familiarity with agentic architectures / autonomous systems
• Exposure to observability and analytics platforms

Working Model
• Prefer candidates with EST time zone overlap
• Ability to work closely with global product and engineering teams

What Success Looks Like
• A scalable, automated QA system for agentic products
• Measurable improvement in AI output quality and reliability
• Reduced production issues and faster detection of failures
• QA evolving from reactive testing to proactive quality intelligence
