Automation QA Engineer

Work from home Full-time role Hiring

Ciklum is looking for an Automation QA Engineer to join our team full-time in Poland.

We are a custom product engineering company that supports both multinational organizations and scaling startups to solve their most complex business challenges. With a global team of over 4,000 highly skilled developers, consultants, analysts and product owners, we engineer technology that redefines industries and shapes the way people live.

About the role:
As an Automation QA Engineer, become a part of a cross-functional development team engineering experiences of tomorrow.

The Project:
We are partnering with B&R Industrial Automation to significantly upgrade the Retrieval-Augmented Generation (RAG) architecture of their AS client's system. Our goal is to drastically reduce AI hallucinations in code generation and optimize retrieval latency without re-architecting their existing platform.

Responsibilities:
Own the evaluation lifecycle, offline acceptance testing, and KPI measurement for the AS client's RAG pipeline
Lead the co-creation and management of the project's "golden dataset" to consistently benchmark AI performance
Implement and manage the RAGAS evaluation harness and automated CI/CD regression testing
Track, classify, and build root-cause taxonomies for LLM hallucinations, with a specialized focus on code-generation correctness
Golden Dataset & Baselines: Collaborate with client domain experts and technical leads to build a robust synthetic test set (~90+ queries across multiple categories) and establish baseline metrics for Faithfulness, Context Precision, and Answer Relevance
Evaluation Harness: Build and automate evaluation pipelines using RAGAS and custom Python scripts, enabling A/B comparisons between the baseline, MVP, and full implementation
Regression & CI/CD Guardrails: Implement automated CI/CD regression checks within Azure DevOps, ensuring that a >5% drop in core metrics automatically blocks pipeline deployments
Hallucination Tracking: Develop a root-cause taxonomy for hallucinations and track code-generation queries separately to ensure the AI generates functionally correct and compilable output
Performance Benchmarking: Measure and monitor pipeline latency, rigorously validating P95 latency targets (under 4.5 s) under representative concurrent load

Requirements:
Background: Mid-to-Senior level experience in Data Science, Machine Learning Evaluation, AI Quality Assurance, or Data Engineering
Evaluation Frameworks: Deep, hands-on experience with LLM evaluation frameworks (e.g., RAGAS, DeepEval, TruLens) and with establishing human-anchored or synthetic benchmarks
Technical Stack: Strong proficiency in Python. Solid experience with CI/CD tools (especially Azure DevOps) and with integrating complex test suites into automated deployment pipelines
Data & Observability: Experience working with databases (PostgreSQL) and integrating custom telemetry or observability data (e.g., Azure App Insights) into evaluation reports
Analytical Mindset: Strong attention to detail with the ability to perform rigorous error analysis, build structured taxonomies for failures, and identify embedding drift
Personal skills: Highly collaborative and data-driven; comfortable working directly with client SMEs to validate queries and presenting evaluation scorecards to guide engineering decisions

What's in it for you?
Strong community: Work alongside top professionals in a friendly, open-door environment
Growth focus: Take on large-scale projects with a global impact and expand your expertise
Tailored learning: Boost your skills with internal events (meetups, conferences, workshops), Udemy access, language courses, and company-paid certifications
Endless opportunities: Explore diverse domains through internal mobility, finding the best fit to gain hands-on experience with cutting-edge technologies
Flexibility: Enjoy full remote working possibilities
Care: We've got you covered with company-paid medical insurance, mental health support, and financial & legal consultations

About us:
At Ciklum, we are always exploring innovations, empowering each other to achieve more, and engineering solutions that matter. With us, you'll work with cutting-edge technologies, contribute to impactful projects, and be part of a One Team culture that values collaboration and progress.

With delivery centers in Wrocław and Gdańsk, our 300+ professionals in Poland drive forward-thinking solutions for global clients. Join a community where collaboration sparks innovation and your impact reaches millions.

Want to learn more about us? Follow us on Instagram, Facebook, LinkedIn.

Explore, empower, engineer with Ciklum!

Interested already? We would love to get to know you! Submit your application. We can't wait to see you at Ciklum.
