24.1.1 – Engagement and scope template

2025.10.06.
AI Security Blog

A successful AI red team engagement hinges on a precisely defined scope and clear rules of engagement (RoE). The scoping document serves as the foundational agreement between the red team and the system stakeholders. It prevents misunderstandings, protects both parties, and keeps the engagement focused on its objectives without causing unintended disruption. Use this template as a starting point, adapting it to the specific context of your target AI system and organizational policies.

AI Red Team Engagement & Scoping Document

This template provides a comprehensive structure for defining the parameters of an AI-focused red team assessment. Fill in each section with as much detail as possible before commencing any testing activities. All parties involved must formally approve this document.

Engagement Plan: [Project Name/Code Name]
Document Control
  • Version: 1.0
  • Date: [YYYY-MM-DD]
  • Author(s): [Red Team Lead Name]
  • Approver(s): [System Owner, CISO, etc.]
1. Executive Summary

Provide a high-level overview of the engagement. State the purpose, the primary target system, and the key business or security drivers for this assessment. This section should be understandable to non-technical stakeholders.

Example: “This document outlines the scope and rules for a red team assessment of the ‘CustomerAssist’ GenAI chatbot. The primary objective is to identify and assess vulnerabilities related to prompt injection, data leakage, and harmful content generation before the model’s public launch. The engagement will run from [Start Date] to [End Date].”

2. Project Objectives

List the specific, measurable goals of the engagement. What questions are you trying to answer?

  • Assess the model’s resilience to indirect and direct prompt injection attacks.
  • Identify potential for sensitive data exfiltration (PII, proprietary code, etc.) through model interaction.
  • Test the effectiveness of content filters and safety mechanisms against adversarial inputs designed to elicit harmful, biased, or off-policy responses.
  • Evaluate the model’s susceptibility to jailbreaking techniques that bypass its core instructions.
  • [Add other objectives, e.g., testing for model inversion, membership inference, etc.]
3. Scope Definition

3.1 In-Scope Systems & Models

List all explicit targets. Be precise with identifiers, endpoints, and versions.

  • Application(s): CustomerAssist Web Portal (UAT Environment)
  • API Endpoint(s): https://api.uat.example.com/v2/chat/
  • Model(s): ‘customer-assist-v2.3-beta’ (model ID), served via the API endpoint listed above.
  • IP Range(s): [Specify IP addresses or ranges if applicable]
  • Accounts/Credentials: Test accounts will be provided: redteam_user{1-5}@example.com

3.2 Out-of-Scope Systems & Models

Explicitly state what is NOT to be tested. This is as important as defining what is in scope.

  • All production systems, including api.example.com.
  • Corporate infrastructure (e.g., email servers, internal networks).
  • Underlying cloud infrastructure (e.g., attempting to compromise the host VM).
  • Any third-party services integrated with the application unless explicitly stated.
  • Physical security of data centers.
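
The in-scope and out-of-scope lists above can also be enforced in test tooling, so that a request to an unlisted host is refused before it is sent. The following is a minimal sketch, assuming the example hostnames from sections 3.1 and 3.2; the names `IN_SCOPE_HOSTS`, `OUT_OF_SCOPE_HOSTS`, and `is_in_scope` are hypothetical and not part of the template.

```python
from urllib.parse import urlsplit

# Assumption: hosts copied from the example values in sections 3.1 and 3.2.
IN_SCOPE_HOSTS = {"api.uat.example.com"}
OUT_OF_SCOPE_HOSTS = {"api.example.com"}


def is_in_scope(url: str) -> bool:
    """Return True only if the URL's host is explicitly listed as in scope."""
    host = (urlsplit(url).hostname or "").lower()
    if host in OUT_OF_SCOPE_HOSTS:
        return False
    # Default deny: any host not explicitly listed is treated as out of scope.
    return host in IN_SCOPE_HOSTS
```

A test harness might call `is_in_scope("https://api.uat.example.com/v2/chat/")` before each request and abort on `False`. The default-deny choice reflects the template's advice that stating what is not tested matters as much as stating what is.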

3.3 In-Scope Attack Vectors

Detail the types of attacks that are permitted.

  • Prompt Injection (Direct, Indirect, Obfuscated)
  • Jailbreaking & Role-Playing Attacks
  • Model-level Denial of Service (e.g., resource exhaustion via complex prompts), but NOT network-level DoS/DDoS.
  • Testing for leakage of training data snippets.
  • Adversarial input generation to test content filters (toxicity, bias, etc.).

3.4 Out-of-Scope Attack Vectors

Detail forbidden tactics to prevent collateral damage.

  • Network-level Denial of Service (DoS/DDoS) attacks.
  • Social engineering of any company employees or contractors.
  • Phishing campaigns.
  • Any attempt to modify or delete data belonging to other users.
  • Exploitation of vulnerabilities discovered in underlying infrastructure (report only).
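
The permitted and forbidden vectors from sections 3.3 and 3.4 can be encoded as a simple lookup that test scripts consult before running a technique. This is only a sketch: the string labels and the `check_vector` function are hypothetical, and the "needs_approval" result for unlisted vectors is an assumed policy, not something the template prescribes.

```python
# Assumption: illustrative labels for the vectors listed in sections 3.3 and 3.4.
PERMITTED_VECTORS = {
    "prompt_injection",
    "jailbreak_role_play",
    "model_level_dos",
    "training_data_leakage",
    "adversarial_filter_input",
}
FORBIDDEN_VECTORS = {
    "network_dos",
    "social_engineering",
    "phishing",
    "other_user_data_modification",
    "infrastructure_exploitation",
}


def check_vector(vector: str) -> str:
    """Classify a vector label as permitted, forbidden, or needing approval."""
    if vector in FORBIDDEN_VECTORS:
        return "forbidden"
    if vector in PERMITTED_VECTORS:
        return "permitted"
    # Anything not listed in either section is referred back to the stakeholders.
    return "needs_approval"
```

Checking the forbidden set first means a label that accidentally ends up in both sets is still blocked.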
4. Rules of Engagement (RoE)
  • Timeline: Engagement begins [YYYY-MM-DD HH:MM UTC] and ends [YYYY-MM-DD HH:MM UTC]. Testing is permitted only during this window.
  • Points of Contact (PoC):
    • Primary (Red Team): [Name, Role, Email, Phone]
    • Primary (Blue Team/System Owner): [Name, Role, Email, Phone]
    • Escalation: [Name, Role, Email, Phone]
  • Communication Protocol: All operational communication will occur in the dedicated Slack channel: #ai-redteam-customerassist. Daily sync meetings will be held at [Time UTC]. (See Chapter 24.1.3 for the full protocol.)
  • Escalation Process: Discovery of critical vulnerabilities (e.g., active PII leakage) must be reported immediately to the Escalation PoC via phone call, followed by an email. (See Chapter 24.1.4 for the full process.)
  • Evidence Handling: All findings, screenshots, and logs must be stored in the designated secure repository. Sensitive data exfiltrated during testing must not be stored on local machines.
  • “Get Out of Jail Free” Clause: The activities performed by the authorized Red Team members, as defined within the scope and rules of this document, are sanctioned by management. Team members will not be subject to punitive action for activities conducted within these established boundaries.
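
The Timeline rule above (testing only inside the agreed UTC window) is easy to enforce in tooling. The sketch below is one possible guard, assuming the start and end times from the signed document are loaded as timezone-aware datetimes; `within_window` is a hypothetical helper name, and the dates in the usage note are placeholders.

```python
from datetime import datetime, timezone


def within_window(now: datetime, start: datetime, end: datetime) -> bool:
    """Return True if `now` falls inside the agreed testing window (inclusive)."""
    for value in (now, start, end):
        # Reject naive timestamps so local time is never mistaken for UTC.
        if value.utcoffset() is None:
            raise ValueError("timestamps must be timezone-aware")
    return start <= now <= end
```

For example, with placeholder values `start = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)` and `end = datetime(2025, 1, 10, 17, 0, tzinfo=timezone.utc)`, a harness could call `within_window(datetime.now(timezone.utc), start, end)` before each test run and stop if it returns `False`.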
5. Approval & Sign-off

By signing below, all parties acknowledge and agree to the terms outlined in this document.

Red Team Lead: _________________________

[Name], [Date]

System Owner: _________________________

[Name], [Date]

Authorizing Manager/CISO: _________________________

[Name], [Date]