[Remote] AI Validation Engineer
Note: This is a remote position open to candidates in the USA.
Agility Partners is hiring for QA / AI Quality Engineer roles to support critical AI initiatives across its digital ecosystem. The role involves designing and implementing validation frameworks for AI-powered systems, ensuring that outputs are accurate and reliable, and collaborating with engineering and product teams to improve system quality.
Responsibilities
- Design and implement validation and QA frameworks for AI-powered systems, including analytics platforms, LLM-driven applications, and agentic workflows
- Evaluate AI outputs for accuracy, grounding, consistency, and alignment with underlying data and expected behavior
- Test and validate key components such as:
  - RAG pipelines and retrieval quality
  - Data pipelines and transformations
  - Prompt behavior and response consistency
  - API integrations and downstream system interactions
- Build automated testing frameworks for structured and unstructured AI systems
- Define and track metrics around accuracy, data integrity, performance, and system reliability
- Identify discrepancies between AI outputs and source data or expected outcomes
- Partner closely with engineering, data, and product teams to improve system quality and reduce defects
- Support rapid iteration cycles by developing scalable evaluation and regression testing approaches
- Monitor AI systems post-deployment to detect drift, degradation, and performance issues
- Help implement guardrails to improve safety, reduce hallucinations, and ensure trustworthy outputs
- Support auditability and traceability of AI-driven systems and insights
Skills
- 3–7+ years of experience in QA, Software Testing, Data Validation, or AI/ML-focused quality engineering
- Strong Python skills for automation, testing, and data validation
- Experience building automated test frameworks and validation pipelines
- Experience testing AI/ML or data-driven systems (LLMs, analytics platforms, or similar)
- Strong SQL or data querying skills and ability to validate large datasets
- Understanding of data pipelines, transformations, and data quality validation
- Experience validating APIs and service integrations
- Ability to analyze logs, outputs, and system behavior to diagnose issues
- Comfortable working in fast-paced, agile environments with evolving requirements
- Experience with LLM evaluation, prompt testing, or RAG-based systems
- Familiarity with agentic workflows or orchestration patterns
- Exposure to LLMOps, AI evaluation tooling, or observability frameworks
- Experience with model monitoring, drift detection, or performance tracking
- Familiarity with analytics platforms or BI tools
- Understanding of explainability techniques, governance, or risk in AI systems
- Awareness of cost, latency, and performance considerations in AI solutions