QA Engineer – AI Systems

Work from home Full-time role Hiring

Dice is the leading career destination for tech experts at every stage of their careers. Our client, Everest Technologies, is seeking the following. Apply via Dice today!

We are seeking a QA Engineer with a strong background in API testing and LLM fine-tuning/evaluation. You will be responsible for the quality assurance of our Agent Mesh infrastructure, ensuring that it correctly translates enterprise business logic into machine-readable actions. Your goal is to ensure that AI agents interact with our infrastructure reliably, securely, and without hallucinating tool calls.

Key Responsibilities

  • AI Tool Validation: Test tool-calling accuracy by verifying that LLMs correctly interpret OpenAPI specifications and trigger the right C#/.NET backend logic.
  • Fine-Tuning Data Preparation: Curate and clean high-quality datasets (JSON/JSONL) in Python to fine-tune models for specific domain tasks and tool-calling accuracy.
  • Prompt Regression Testing: Develop automated test suites to ensure that updates to underlying APIs or MCP servers do not break the reasoning or planning capabilities of the AI agents.
  • Security & Auth QA: Validate that Gravitee correctly enforces OAuth 2.1 and OpenFGA authorization, preventing unauthorized data leakage through agent conversations.
  • Performance Testing: Measure latency in the agent-to-API loop and identify bottlenecks in MCP server responses.

Technical Qualifications

  • API Testing Mastery: Expert knowledge of REST, OpenAPI, and tools like Postman or Insomnia.
  • Scripting: Proficiency in Python (for data processing and eval frameworks) and familiarity with C# (to understand backend MCP implementation).
  • LLM Evaluation: Experience with frameworks like DeepEval, Ragas, or LangSmith to measure model performance (faithfulness, relevancy, and tool-call precision).
  • API Management: Hands-on experience with Gravitee or similar gateways to monitor and intercept traffic.
  • Model Context Protocol: Understanding of MCP and how it standardizes the way LLMs access external data.

Preferred Skills

  • Experience with Red Teaming AI agents to identify prompt injection vulnerabilities.
  • Knowledge of Vector Databases and how RAG (Retrieval-Augmented Generation) interacts with live API tools.
  • Familiarity with GitHub Actions for CI/CD integration of AI evaluation pipelines.
