17.1.4 Perturbation Budgets

2025.10.06.
AI Security Blog

An attack’s success rate is meaningless without knowing the rules of engagement. The perturbation budget defines these rules. It’s the maximum amount of “cheating” an adversary is allowed, quantifying the trade-off between an attack’s power and its stealth. Without a budget, an attacker could simply replace an image of a cat with a dog and claim 100% success.

A perturbation budget, often denoted by epsilon (ε), is a constraint on the magnitude of the adversarial noise added to an input. It formalizes the notion of “closeness” between the original input and the adversarial example. This constraint is crucial for creating realistic and meaningful evaluations. An infinitely powerful adversary isn’t a useful threat model; a constrained one is.

The choice of how you measure this “closeness” has profound implications for the type of attack you are simulating. The most common way to define these budgets is through mathematical norms, specifically L-p norms.

The Language of Constraints: L-p Norms

In the context of adversarial attacks, an L-p norm measures the distance between the original input vector (e.g., all pixel values flattened into a list) and the perturbed one. Different norms enforce different kinds of constraints, leading to visually and structurally distinct adversarial examples.
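Formally, writing the perturbation as δ = x_adv − x_orig with components δ_i, the general L-p distance and its limiting cases are:

\[
\|\delta\|_p = \Big( \sum_{i=1}^{n} |\delta_i|^p \Big)^{1/p},
\qquad
\|\delta\|_\infty = \max_i |\delta_i|,
\qquad
\|\delta\|_0 = |\{\, i : \delta_i \neq 0 \,\}|
\]

An attack operating under a budget ε must then satisfy ‖δ‖_p ≤ ε.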

[Figure: Visualization of L-p norm unit balls. A 2D plot (axes x1, x2) of the “unit ball” for the L∞, L2, L1, and L0 norms. Any point within a shape is a valid perturbation under that norm’s budget.]

L-infinity (L∞) Norm: The Max-Change Budget

The L∞ norm measures the maximum absolute change across all features (e.g., pixels). If your budget ε is 8/255 for an image, this means no single pixel’s value can be changed by more than 8 (on a 0-255 scale). This encourages small, diffuse perturbations spread across the entire input.

  • Intuition: A “box” constraint around the original input.
  • Effect: Often results in visually imperceptible, low-amplitude noise.
  • Use Case: Simulating subtle, widespread manipulations that are hard for a human to spot. This is the most common budget for standard adversarial robustness benchmarks.
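Iterative attacks typically enforce this budget by clipping the running perturbation back into the ε-box after every step. Below is a minimal sketch of that projection, assuming inputs scaled to [0, 1]; the function and variable names are illustrative, not taken from any particular library.

# Minimal sketch: project an adversarial example back into an L-infinity
# ball of radius eps around the original input, then clamp to valid pixels.
import numpy as np

def project_linf(original, adversarial, eps=8/255):
    perturbation = np.clip(adversarial - original, -eps, eps)  # per-pixel box constraint
    return np.clip(original + perturbation, 0.0, 1.0)          # keep pixels in [0, 1]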

L2 Norm: The Euclidean Distance Budget

The L2 norm measures the standard Euclidean distance—the “straight-line” distance between the original and perturbed input vectors. This budget limits the overall magnitude of the perturbation. It allows for larger changes in a few pixels, as long as other pixels are changed very little to compensate.

  • Intuition: A “sphere” or “circle” constraint around the original input.
  • Effect: Can create slightly more noticeable, but still distributed, noise patterns compared to L∞. The total energy of the added noise is bounded.
  • Use Case: Evaluating robustness against perturbations where the overall change is limited, but individual feature changes can be more flexible.
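An L2 budget is usually enforced by rescaling: if the perturbation’s Euclidean norm exceeds ε, shrink it back onto the ε-sphere. A minimal sketch under the same [0, 1] pixel assumption (names are illustrative):

# Minimal sketch: rescale a perturbation so its L2 norm does not exceed eps.
import numpy as np

def project_l2(original, adversarial, eps=1.0):
    perturbation = adversarial - original
    l2 = np.linalg.norm(perturbation)                  # Euclidean norm of the flattened noise
    if l2 > eps:
        perturbation = perturbation * (eps / l2)       # shrink back onto the eps-sphere
    return np.clip(original + perturbation, 0.0, 1.0)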

L1 Norm: The Sum-of-Changes Budget

The L1 norm measures the sum of the absolute changes across all features. To stay within the budget, an attack must be economical with its changes. This naturally encourages sparsity—making large changes to a very small number of features while leaving most untouched.

  • Intuition: A “diamond” constraint that favors solutions along the axes.
  • Effect: Creates sparse perturbations, like altering a few specific pixels dramatically.
  • Use Case: Simulating attacks where an adversary can only modify a few input elements, such as changing a few words in a text review or altering pixels in a specific region.
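One simple way to satisfy an L1 budget is to rescale the perturbation whenever the sum of absolute changes exceeds ε. Note that this is only a sketch: it keeps the perturbation inside the L1 ball but is not the exact Euclidean projection onto it (which requires a soft-thresholding step), and the names are again illustrative.

# Minimal sketch: rescale a perturbation so the sum of absolute changes
# stays within eps. (Exact projection onto the L1 ball would soft-threshold
# the perturbation instead; rescaling is a simpler way to respect the budget.)
import numpy as np

def enforce_l1(original, adversarial, eps=10.0):
    perturbation = adversarial - original
    l1 = np.sum(np.abs(perturbation))
    if l1 > eps:
        perturbation = perturbation * (eps / l1)
    return np.clip(original + perturbation, 0.0, 1.0)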

L0 Norm: The Feature-Count Budget

The L0 norm simply counts the number of features that have been changed. The budget ε is an integer representing the maximum number of pixels you can alter. The magnitude of the change for those altered pixels is unlimited by the norm itself (though often bounded by other constraints, like valid pixel values).

  • Intuition: A “count” constraint. You have a fixed number of “moves.”
  • Effect: Generates extremely sparse attacks, often called “few-pixel” attacks, where changing as few as one or two pixels can fool a model.
  • Use Case: Modeling threat actors with highly restricted modification capabilities, like placing a few specific stickers on a stop sign.
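An L0 budget can be enforced by keeping only the k largest-magnitude changes and reverting every other feature to its original value. A minimal sketch, assuming k ≥ 1 and illustrative names:

# Minimal sketch: keep only the k largest-magnitude changes (k >= 1),
# reverting every other feature to its original value.
import numpy as np

def enforce_l0(original, adversarial, k=5):
    perturbation = (adversarial - original).ravel()
    if np.count_nonzero(perturbation) > k:
        smallest = np.argsort(np.abs(perturbation))[:-k]  # all but the k biggest changes
        perturbation[smallest] = 0.0
    return original + perturbation.reshape(original.shape)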

Practical Application: Setting and Measuring Budgets

As a red teamer, your job isn’t just to know these norms but to choose and apply them correctly. The budget you set defines your threat model.

Norm | Mathematical Form | Practical Interpretation | Typical Attack Appearance
L∞   | max(|x_adv - x_orig|) | Limits the largest single pixel/feature change. | Subtle, widespread, low-contrast noise.
L2   | sqrt(sum((x_adv - x_orig)²)) | Limits the total “energy” or magnitude of the noise. | Smooth, distributed noise, sometimes visible.
L1   | sum(|x_adv - x_orig|) | Limits the total sum of all changes, encouraging sparsity. | Sparse, localized changes; a few bright/dark spots.
L0   | count(x_adv != x_orig) | Limits the number of pixels/features that can be changed. | “Few-pixel” attacks; distinct, isolated pixel changes.

Measuring these budgets in code is straightforward. Here’s a conceptual example showing how you’d calculate the norms for a given perturbation.

# Python to calculate perturbation norms
import numpy as np

# 'original' and 'adversarial' are numpy arrays of the same shape holding
# image pixel data scaled to [0, 1]. The arrays below are synthetic and
# exist only so the snippet runs on its own.
rng = np.random.default_rng(0)
original = rng.random((8, 8))
adversarial = np.clip(original + rng.uniform(-8/255, 8/255, size=(8, 8)), 0.0, 1.0)

perturbation = adversarial - original

# L-infinity norm: The largest change to any single pixel
linf_distance = np.max(np.abs(perturbation))
print(f"L-infinity distance: {linf_distance:.4f}")

# L2 norm: The Euclidean distance
l2_distance = np.sqrt(np.sum(perturbation ** 2))
print(f"L2 distance: {l2_distance:.4f}")

# L1 norm: The sum of all absolute changes
l1_distance = np.sum(np.abs(perturbation))
print(f"L1 distance: {l1_distance:.4f}")

# L0 norm: The number of changed features (array elements)
l0_distance = np.count_nonzero(perturbation)
print(f"L0 distance (changed elements): {l0_distance}")

Beyond Pixel-Space: Semantic and Physical Budgets

While L-p norms are the bedrock of adversarial evaluation, they are not the only way to define a budget. Real-world threat models often operate under different constraints:

  • Semantic Budgets: Instead of limiting pixel values, you might limit changes to high-level features. For example, a budget could allow changing the “brightness” of an image, or the “sentiment” of a sentence, while preserving its core content.
  • Physical Budgets: For attacks that must exist in the real world (e.g., a sticker on a sign), the budget is constrained by what is physically realizable. This includes factors like printability, robustness to different viewing angles, and lighting conditions.
  • Domain-Specific Budgets: In tabular data, a budget might restrict changes to realistic bounds. You can’t decrease a person’s age by 20 years or multiply their income by 1,000; the budget must respect the domain’s logic (see the sketch after this list).
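For tabular data, for instance, the budget often reduces to a set of per-feature validity rules. The sketch below is purely hypothetical: the feature names and bounds are invented for illustration, not taken from any real dataset.

# Hypothetical sketch: clamp an adversarial tabular record back into
# per-feature bounds. Feature names and ranges are invented for illustration.
import numpy as np

FEATURE_BOUNDS = {
    "age":    (18, 100),      # years: must stay in a plausible range
    "income": (0, 500_000),   # currency units: no negative or absurd values
}

def clip_to_domain(record, feature_order=("age", "income")):
    clipped = np.array(record, dtype=float)
    for i, name in enumerate(feature_order):
        low, high = FEATURE_BOUNDS[name]
        clipped[i] = np.clip(clipped[i], low, high)
    return clipped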

Ultimately, the perturbation budget is the most critical parameter you will set when evaluating robustness. It must be explicitly stated and justified. A claim like “Our model is 90% robust” is incomplete. A better claim is “Our model is 90% robust against attacks with an L∞ budget of ε=4/255.” This provides the context needed for a meaningful and reproducible security assessment.