Where the Fast Gradient Sign Method (FGSM) takes a single, decisive leap to create a perturbation, iterative methods adopt a more patient and refined approach. They take many small, calculated steps, much like a climber carefully finding handholds to reach a summit. This iterative process often produces more subtle and powerful adversarial examples, making it a cornerstone of modern adversarial attacks.
The Principle of Iterative Refinement
The core idea behind iterative attacks is simple: a more effective adversarial example can be found by repeatedly applying a weaker attack method. Instead of maximizing the loss in one go, you nudge the input slightly in the direction of the gradient over several steps. Each step pushes the input closer to a decision boundary, and by keeping the steps small, the attack can more accurately trace the contours of the loss landscape.
This avoids the “overshooting” problem common with FGSM, where the large step size might land the perturbed input in a region that is easily detectable or even correctly classified. The most prominent and effective of these methods is Projected Gradient Descent (PGD).
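To make the contrast concrete, here is a minimal sketch of that core loop: repeated small gradient-sign steps and nothing more. It assumes a PyTorch classifier, a loss such as cross-entropy, and inputs x scaled to [0, 1]; the function name and hyperparameters are illustrative, not a fixed API. PGD, covered next, layers random initialization and projection on top of exactly this loop.

import torch

def iterative_step_attack(model, loss_fn, x, y, alpha, num_iter):
    # Repeatedly apply a small FGSM-style step; no random start, no projection.
    x_adv = x.clone().detach()
    for _ in range(num_iter):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        # Gradient of the loss w.r.t. the input only (model weights are untouched)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()  # small step up the loss surface
            x_adv = torch.clamp(x_adv, 0, 1)     # keep pixel values valid
    return x_adv.detach()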
Deconstructing Projected Gradient Descent (PGD)
Projected Gradient Descent is arguably the most important first-order adversarial attack. It’s not just an attack; it’s a benchmark. If you design a defense, its resilience against a strong PGD attack is a critical measure of its effectiveness. PGD refines the iterative concept with two key additions: random initialization and projection.
The Three Pillars of PGD
Understanding PGD means understanding its three phases: a one-time random initialization, followed by a loop that repeats a gradient step and a projection:
- Random Initialization: The process doesn’t start from the original input. Instead, it begins by adding a small, random perturbation to the input, placing it at a random point inside the allowed threat model boundary (the L-infinity or L2 ball). This prevents the attack from getting immediately stuck in local optima near the original image and encourages a more thorough exploration of the surrounding loss surface.
- Iterative Gradient Step: For a set number of iterations, the attack performs a small gradient ascent step. This is conceptually similar to FGSM but with a much smaller step size (alpha, α). You calculate the gradient of the loss with respect to the input, take its sign, and move the perturbed input a tiny distance in that direction.
- Projection: After each step, the total perturbation is checked against the maximum allowed amount (epsilon, ε). If the perturbation has exceeded this boundary, it is “projected” or “clipped” back onto the surface of the ε-ball. This ensures the final adversarial example strictly adheres to the threat model constraints (e.g., it remains visually indistinguishable from the original).
# PGD attack (PyTorch, L-infinity threat model); assumes inputs scaled to [0, 1]
import torch

def pgd(model, loss_fn, x, y, epsilon, alpha, num_iter):
    # 1. Start with a random perturbation within the epsilon-ball
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)
    x_adv = torch.clamp(x + delta, 0, 1)  # ensure image values are valid

    # 2. Begin iterative process
    for _ in range(num_iter):
        x_adv = x_adv.clone().detach().requires_grad_(True)

        # Calculate gradient of the loss w.r.t. the input
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        with torch.no_grad():
            # Update the adversarial example with a small step (alpha)
            x_adv = x_adv + alpha * grad.sign()

            # 3. Project the perturbation back into the epsilon-ball:
            # the total change from the original x must stay within epsilon
            delta = torch.clamp(x_adv - x, -epsilon, epsilon)
            # Clip final values to be valid (e.g., [0, 1] for image pixels)
            x_adv = torch.clamp(x + delta, 0, 1)

    return x_adv.detach()
The Practical Trade-Off: Strength vs. Speed
The primary disadvantage of PGD and other iterative methods is computational cost. Where FGSM requires only one forward and one backward pass through the model, a 100-iteration PGD attack requires 100 of each (per random restart). This makes it significantly slower. However, the quality of the resulting adversarial example is typically much higher.
| Attribute | FGSM (Single-Step) | PGD (Iterative) |
|---|---|---|
| Speed | Very Fast (1 gradient calculation) | Slow (N gradient calculations for N iterations) |
| Attack Strength | Lower; often fails against robust models | Very High; considered a strong baseline attack |
| Evasion Quality | Coarser; the large single step is easier to detect and can cause “label leaking” during adversarial training | Often produces more subtle, transferable perturbations |
| Primary Use Case | Rapidly generating many weak examples; adversarial training | Robustness evaluation; creating powerful, single examples |
Red Teaming Application and Strategy
As a red teamer, PGD should be your go-to algorithm for a thorough robustness check. While you might use FGSM for a quick, preliminary test to find low-hanging fruit, a system’s ability to withstand a multi-iteration PGD attack is a far more meaningful security metric.
When you configure a PGD attack, you control the key hyperparameters: the constraint (ε), the step size (α), and the number of iterations. A common rule of thumb is to set the step size to be around 2.5 times ε divided by the number of iterations. For a serious evaluation, you should use a sufficient number of iterations (e.g., 40-100) and several random restarts to ensure you have thoroughly explored the potential for adversarial examples. If a model resists this, it has a meaningful level of robustness. If it crumbles, you’ve found a critical vulnerability.
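To make those recommendations concrete, here is a rough sketch that assumes the pgd function from the code above: it applies the step-size rule of thumb and wraps the attack in several random restarts, keeping any restart that flips the model's prediction. The specific values (ε = 8/255, 40 iterations, 5 restarts) are illustrative defaults for image inputs in [0, 1], not prescriptions.

import torch

epsilon = 8 / 255                  # L-infinity budget for [0, 1] image inputs
num_iter = 40
alpha = 2.5 * epsilon / num_iter   # rule of thumb: ~2.5 * epsilon / iterations
num_restarts = 5

def pgd_with_restarts(model, loss_fn, x, y):
    # Assumes an N x C x H x W image batch and the pgd() function defined earlier
    best = x.clone()
    for _ in range(num_restarts):
        x_adv = pgd(model, loss_fn, x, y, epsilon, alpha, num_iter)
        with torch.no_grad():
            fooled = model(x_adv).argmax(dim=1) != y  # samples this restart misclassifies
        # Keep the adversarial version wherever this restart succeeded
        best = torch.where(fooled.view(-1, 1, 1, 1), x_adv, best)
    return best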