Job Description
We are seeking a Lead AI Red Teaming & QA Engineer to design and execute automated adversarial testing for our enterprise Agentic AI platforms. You will move beyond traditional software QA to build continuous safety pipelines, ensuring our non-deterministic LLM agents, RAG systems, and tool integrations are secure, resilient, and compliant before production release.

Key Responsibilities
Automated Adversarial Testing: Build and integrate automated red teaming suites into CI/CD pipelines using frameworks like Garak, PyRIT, and AgentDojo to enforce strict safety release gates.
AI Evaluation Frameworks: Develop metrics and continuous testing for core AI risks, including hallucinations, memorisation, algorithmic bias, uncertainty, and model drift.
Regulatory Compliance Evidence: Map threat models (OWASP LLM Top 10, Agentic threats) to automated test cases. Produce the technical testing evidence required by EU AI Act Article 15, DORA, and FCA Operational Resilience guidelines.
Centralised AI-BOM Platform: Own the enterprise AI Bill of Materials (AI-BOM), tracking model lineages, dataset versions, and signed artifacts as a centralised evaluation service.

Required Technical Skills
Regulated Finance: Proven experience testing software within FCA, DORA, or EU AI Act frameworks.
AWS Bedrock Ecosystem: Hands-on experience configuring, testing, and bypassing Bedrock Guardrails, Agents, and Knowledge Bases (RAG).
AI Security & Fundamentals: Solid understanding of Foundation Models, tool use (function calling), OWASP LLM Top 10, and NIST AI RMF.
Automation Stack: Strong Python development skills, experience with AI eval tools (Garak, PyRIT, Ragas), and building complex CI/CD test pipelines.

Randstad Technologies is acting as an Employment Business in relation to this vacancy.
First seen 2026-05-20 07:00:01 · Last verified 2026-05-20 07:00:01
Pentest Careers · pentestcareers.com