25.4.3 Standard-practice mapping

2025.10.06.
AI Security Blog

Connecting your AI red teaming activities to established industry standards and regulatory frameworks is not just good practice; it’s essential for demonstrating due diligence, achieving compliance, and integrating security into the broader governance structure. This mapping provides a reference for aligning specific red team tests with key principles from major AI-related standards.

Use this table to translate technical findings into business-relevant compliance language, justify testing efforts to stakeholders, and ensure your assessments cover requirements mandated by regulations like the EU AI Act or frameworks like the NIST AI RMF.

Kapcsolati űrlap - EN

Do you have a question about AI Security? Reach out to us here:

AI Red Teaming Activities Mapped to Industry Standards
Standard / Framework Relevant Section / Principle AI Red Teaming Activity Connection & Purpose
NIST AI Risk Management Framework (AI RMF 1.0)
NIST AI RMF Robustness & Resilience (Core: MAP 1-3) Adversarial Example Generation (Evasion) Tests the model’s ability to maintain performance when inputs are intentionally perturbed, directly assessing its resilience against common attacks.
NIST AI RMF Safety (Core: GOVERN 4-4) Jailbreaking & Prompt Injection on Safety-Critical Systems Validates that safety filters and alignment mechanisms cannot be bypassed, preventing the AI from generating harmful or unsafe outputs.
NIST AI RMF Security & Privacy (Core: MEASURE 2-5) Model Inversion & Membership Inference Attacks Assesses the risk of the model leaking sensitive training data, which directly relates to privacy-enhancing controls and data governance.
NIST AI RMF Explainability & Interpretability (Core: MAP 2-4) Feature Attribution Analysis & Counterfactual Explanations While not a direct attack, this red team activity tests whether the model’s decision-making process is scrutable, a key aspect of trustworthiness.
EU AI Act (High-Risk Systems)
EU AI Act Article 15: Accuracy, robustness and cybersecurity Data Poisoning Simulation (Backdoor Attack) Directly tests the system’s resilience to corrupted training data and ensures it behaves as intended, even under adversarial conditions.
EU AI Act Article 15: Accuracy, robustness and cybersecurity Fuzzing of Input Interfaces and APIs Evaluates the system’s resilience against unexpected or malicious inputs, a core requirement for cybersecurity in high-risk applications.
EU AI Act Article 13: Transparency and provision of information to users Detection of Watermarks & Provenance Markers Verifies that mechanisms for identifying AI-generated content are effective and cannot be easily stripped, ensuring transparency for end-users.
MITRE ATLAS (Adversarial Threat Landscape for AI Systems)
MITRE ATLAS ML Model Access (TA0005) Model Stealing (Extraction) Attacks Simulates an attacker’s attempt to steal a proprietary model by querying its public API, mapping directly to tactics for illicitly gaining model access.
MITRE ATLAS Evasion (TA0003) Physical-World Adversarial Attacks (e.g., printed patches) Emulates techniques for deceiving models (like computer vision systems) in the physical world, a specific and critical evasion tactic.
OWASP Top 10 for Large Language Models
OWASP LLM Top 10 LLM01: Prompt Injection Indirect Prompt Injection Testing Assesses the model’s vulnerability to manipulation through third-party data sources (e.g., web pages, documents), a primary threat vector for LLMs.
OWASP LLM Top 10 LLM04: Model Denial of Service Resource Exhaustion via Complex Queries Tests whether specifically crafted, computationally expensive prompts can overload the model, leading to a denial of service for other users.
OWASP LLM Top 10 LLM06: Sensitive Information Disclosure Targeted Querying to Extract PII Probes the LLM to determine if it inadvertently reveals sensitive data from its training set or user prompts, directly testing for this vulnerability.
ISO/IEC AI Standards
ISO/IEC 23894 Risk Management – Robustness Model Robustness Benchmarking (e.g., against common corruptions) Systematically evaluates model performance against a range of data shifts and perturbations, aligning with the standard’s focus on robustness.
ISO/IEC 42001 Annex A: AI system impact assessment Red Teaming for Unintended Consequences & Dual-Use A strategic red teaming activity that explores how the AI system could be misused or cause unforeseen harm, directly informing the impact assessment required by the standard.

A Living Reference: The landscape of AI regulation and standardization is evolving rapidly. The mappings presented here are based on the versions of these documents available at the time of writing. As a practitioner, you must treat this as a starting point. Always consult the latest official publications and be prepared to adapt your testing methodologies as new requirements and best practices emerge.