24.4.1 Risk matrix template

2025.10.06.
AI Security Blog

A risk matrix is a fundamental tool for translating your red teaming findings into a prioritized action plan. It helps you move from identifying a vulnerability to quantifying its potential harm, enabling stakeholders to make informed decisions. This template provides a structure for assessing and visualizing AI-specific risks.

1. Defining the Axes: Likelihood and Impact

The effectiveness of a risk matrix depends entirely on how clearly you define its axes. For AI systems, these definitions must account for the unique characteristics of machine learning models and their attack surfaces. You must tailor these scales to your specific organization, model, and deployment context.

1.1. Likelihood Scale

Likelihood assesses the probability of a threat actor successfully exploiting a vulnerability. For AI, this considers factors like the complexity of the attack, required resources, and the availability of tools or knowledge.

| Level | Score | Description (Example for an LLM-based agent) |
|---|---|---|
| Very Likely | 5 | Attack can be executed by non-technical users with public tools or simple, well-documented techniques (e.g., basic prompt injection). The vulnerability is trivial to exploit. |
| Likely | 4 | Attack requires some technical skill and uses known, published techniques. Tools may be available but require configuration. Attacker needs a general understanding of ML concepts. |
| Possible | 3 | Attack requires significant technical expertise, custom scripts, or access to moderate computational resources. The technique is known but not widely trivialized. |
| Unlikely | 2 | Attack requires specialized knowledge, significant computational resources (e.g., model fine-tuning for a data poisoning attack), and potentially insider access or information. |
| Very Unlikely | 1 | Attack is theoretical or requires nation-state level resources, unpublished zero-day techniques, and/or long-term access to the system’s internal state. |

1.2. Impact Scale

Impact measures the magnitude of harm if the vulnerability is exploited. For AI systems, impact can be multifaceted, affecting business operations, user safety, and brand reputation simultaneously.

| Level | Score | Description (Example for a multi-domain AI system) |
|---|---|---|
| Catastrophic | 5 | Causes severe financial loss, critical system failure across multiple services, significant safety events (physical harm), or irreparable reputational damage. Triggers major regulatory action. |
| Major | 4 | Causes substantial financial loss, core service disruption, widespread leakage of sensitive PII, or significant reputational damage. May lead to regulatory investigation. |
| Moderate | 3 | Causes measurable financial loss, degradation of a key service, leakage of non-critical user data, or noticeable public relations issues. The model produces consistently harmful or biased outputs. |
| Minor | 2 | Causes minimal financial loss, temporary service impairment, or minor user inconvenience. Isolated incidents of biased or incorrect model outputs that are easily corrected. |
| Insignificant | 1 | Negligible impact on operations, finance, or reputation. The exploit produces an undesirable but harmless outcome with no lasting effects. |
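If you track findings in code, the two scales can be encoded as integer enumerations so that ratings stay tied to their named levels. The following is a minimal sketch, not part of the template itself; the class names `Likelihood` and `Impact` are illustrative assumptions, while the level names and scores come from the two tables above.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Likelihood scale from section 1.1 (class name is an assumption)."""
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


class Impact(IntEnum):
    """Impact scale from section 1.2 (class name is an assumption)."""
    INSIGNIFICANT = 1
    MINOR = 2
    MODERATE = 3
    MAJOR = 4
    CATASTROPHIC = 5


# IntEnum members behave like plain integers, so they can be multiplied directly.
example_product = Likelihood.VERY_LIKELY * Impact.MODERATE
print(example_product)
```

Using `IntEnum` rather than bare integers makes it harder to pass an out-of-range rating by accident, because `Likelihood(6)` raises a `ValueError`.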

2. The Risk Matrix and Scoring

Once you have defined the scales, you can combine them into a matrix. The risk score is typically calculated as Risk Score = Likelihood × Impact. This score then maps to a qualitative risk level, which dictates the required response.

2.1. Risk Level Thresholds

  • Low (1-4): Acceptable risk. Monitor, but no immediate action required.
  • Medium (5-9): Requires attention. Mitigation should be planned and scheduled.
  • High (10-16): Unacceptable risk. Requires prompt attention and a mitigation plan.
  • Critical (17-25): Unacceptable risk. Requires immediate action and escalation.
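The scoring formula and the thresholds above can be expressed as two small functions. This is a sketch under the assumption that both scales run from 1 to 5 as defined in section 1; the function names `risk_score` and `risk_level` are hypothetical, and you should adjust the cut-offs if your organization uses different bands.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Return Likelihood x Impact, validating that both ratings are in 1..5."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact


def risk_level(score: int) -> str:
    """Map a numeric score to the qualitative bands from section 2.1."""
    if not 1 <= score <= 25:
        raise ValueError(f"score must be between 1 and 25, got {score}")
    if score <= 4:
        return "Low"
    if score <= 9:
        return "Medium"
    if score <= 16:
        return "High"
    return "Critical"


# Usage: a finding rated Likely (4) with Major (4) impact.
score = risk_score(4, 4)
print(score, risk_level(score))
```

Keeping the band boundaries in one function means a change to the thresholds only has to be made in one place.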

2.2. Visual Matrix Template

Use this table to plot your findings. The vertical axis represents Likelihood, and the horizontal axis represents Impact. The numbers in the cells are the calculated risk scores.

| Likelihood ↓ / Impact → | Insignificant (1) | Minor (2) | Moderate (3) | Major (4) | Catastrophic (5) |
|---|---|---|---|---|---|
| Very Likely (5) | 5 | 10 | 15 | 20 | 25 |
| Likely (4) | 4 | 8 | 12 | 16 | 20 |
| Possible (3) | 3 | 6 | 9 | 12 | 15 |
| Unlikely (2) | 2 | 4 | 6 | 8 | 10 |
| Very Unlikely (1) | 1 | 2 | 3 | 4 | 5 |
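The grid above does not need to be maintained by hand, since every cell is simply the product of its row and column scores. The sketch below regenerates it; the helper names `build_matrix` and `render_matrix` and the column widths are assumptions made for illustration.

```python
IMPACT_LABELS = ["Insignificant", "Minor", "Moderate", "Major", "Catastrophic"]
LIKELIHOOD_LABELS = ["Very Unlikely", "Unlikely", "Possible", "Likely", "Very Likely"]


def build_matrix() -> list[list[int]]:
    """Rows run from Very Likely (5) down to Very Unlikely (1); columns from impact 1 to 5."""
    return [
        [likelihood * impact for impact in range(1, 6)]
        for likelihood in range(5, 0, -1)
    ]


def render_matrix(matrix: list[list[int]]) -> str:
    """Render the matrix as fixed-width text with row and column labels."""
    header_cells = [
        f"{label} ({score})" for score, label in enumerate(IMPACT_LABELS, start=1)
    ]
    lines = ["Likelihood / Impact".ljust(20) + "".join(c.rjust(18) for c in header_cells)]
    for row, likelihood in zip(matrix, range(5, 0, -1)):
        row_label = f"{LIKELIHOOD_LABELS[likelihood - 1]} ({likelihood})"
        lines.append(row_label.ljust(20) + "".join(str(v).rjust(18) for v in row))
    return "\n".join(lines)


print(render_matrix(build_matrix()))
```

Generating the grid this way keeps it consistent with the scoring formula if you later change the size of either scale.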

3. Example Application

Let’s assess a hypothetical finding to see how the matrix works in practice.

  • Finding: A jailbreak prompt allows users to bypass the safety filter of a public-facing customer support chatbot, causing it to generate malicious code snippets.
  • Likelihood Assessment: The jailbreak prompt is circulating on public forums and requires only copy-pasting. This is a Very Likely (5) event.
  • Impact Assessment: While not catastrophic, providing malicious code could lead to users compromising their own systems, resulting in reputational damage and potential legal liability. This is a Moderate (3) impact.
  • Risk Calculation:
    • Likelihood (5) × Impact (3) = Risk Score 15
    • According to the thresholds in section 2.1, a score of 15 falls into the High (10-16) category.
  • Action: This finding requires prompt attention. A mitigation plan must be developed to improve the model’s safety filter and implement more robust input validation.
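When a red team engagement produces several findings, the same calculation can be used to rank them. The sketch below is illustrative only: the `Finding` class and `prioritize` function are hypothetical names, and apart from the jailbreak finding assessed above, the entries and their ratings are invented placeholders rather than real results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    likelihood: int  # 1-5, per the likelihood scale
    impact: int      # 1-5, per the impact scale

    @property
    def score(self) -> int:
        """Risk Score = Likelihood x Impact."""
        return self.likelihood * self.impact


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest risk score."""
    return sorted(findings, key=lambda f: f.score, reverse=True)


findings = [
    # Placeholder entry with assumed ratings, for illustration only.
    Finding("Hypothetical: training-data poisoning via insider access", likelihood=2, impact=4),
    # The worked example from this section.
    Finding("Jailbreak prompt bypasses chatbot safety filter", likelihood=5, impact=3),
    # Placeholder entry with assumed ratings, for illustration only.
    Finding("Hypothetical: isolated incorrect answers", likelihood=4, impact=1),
]

for finding in prioritize(findings):
    print(f"{finding.score:>2}  {finding.title}")
```

Sorting by score gives a first-pass ordering; findings with equal scores still need a human judgment call about which to address first.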