8.1.2. Physical World Attacks

2025.10.06.
AI Security Blog

Moving an attack from a digital canvas to the physical world is the ultimate test of its robustness. While digital perturbations prove a model’s theoretical vulnerability, a physical attack demonstrates a tangible, exploitable flaw in a deployed system. This is where red teaming image processing systems becomes a kinetic activity, involving printers, cameras, and real-world environments. Your objective is to bridge the “reality gap,” crafting adversarial examples that survive the journey from pixels on your screen to photons hitting a target’s sensor.

The Digital-to-Physical Gap: Expectation over Transformation

You cannot simply print a digital adversarial example and expect it to work. The physical world introduces a myriad of uncontrolled variables that can destroy the carefully crafted perturbation: viewing angle, distance, lighting, shadows, camera sensor noise, and even the texture of the paper it’s printed on. A successful physical attack must be resilient to these changes.

The core concept enabling this resilience is Expectation over Transformation (EOT). Instead of optimizing an attack for a single, static image, EOT optimizes it to be effective over a distribution of possible transformations. In essence, you are training the attack to be adversarial from many different perspectives simultaneously.

[Figure: Expectation over Transformation (EOT) concept. A digital adversarial patch placed next to a banana is passed through a distribution of views (patch rotation, scale, lighting); optimizing across that distribution yields a robust physical attack, and the model sees "toaster".]

During generation, your optimization loop doesn’t just evaluate the attack on one image. Instead, at each step, it applies a random transformation (e.g., rotation, scaling, brightness change) before feeding it to the target model. The loss is then calculated based on this transformed version. By repeating this thousands of times, the optimizer is forced to find a pattern that is adversarial across the entire family of transformations, making it far more likely to survive in the unpredictable physical world.
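Written as an optimization problem (the formulation introduced by Athalye et al. for EOT, with T the chosen distribution of transformations t), the attack searches for an input x' that maximizes the expected log-probability of the target class over that distribution, while keeping the expected perceived difference from the original input x small:

$$x' = \arg\max_{x'} \; \mathbb{E}_{t \sim T}\left[\log P\left(y_{\text{target}} \mid t(x')\right)\right] \quad \text{subject to} \quad \mathbb{E}_{t \sim T}\left[d\left(t(x'),\, t(x)\right)\right] < \epsilon$$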

Key Attack Modalities

Physical attacks manifest in several common forms, each suited for different targets and scenarios. Your choice of modality depends on the system you are testing and the effect you want to achieve.

1. Adversarial Patches

This is the most direct application of EOT. An adversarial patch is a printed, sticker-like image designed to cause misclassification or evade detection when placed in a camera’s field of view. For example, a patch placed next to a legitimate object can cause a classifier to misidentify it (e.g., a banana becomes a toaster), or a person holding a specific patch can become “invisible” to a person detector like YOLO.

The generation process involves defining a patch location, an objective (e.g., minimize the “person” class probability, or maximize the probability of a target class such as “toaster”), and a set of transformations. The sketch below illustrates this loop in PyTorch; the ResNet-50 classifier, the ImageNet “toaster” index, and the scene loader are illustrative stand-ins for whatever your target system actually uses.

# Robust physical patch generation with EOT.
# A PyTorch/torchvision sketch: the ResNet-50 classifier, the ImageNet "toaster"
# index (859), and the scene loader are illustrative assumptions.
import random
import torch
import torch.nn.functional as F
import torchvision.transforms as T
from torchvision.models import resnet50, ResNet50_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet50(weights=ResNet50_Weights.DEFAULT).to(device).eval()
normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

TARGET_CLASS = 859        # "toaster" in ImageNet-1k
PATCH_SIZE = 64           # 64x64 RGB patch
NUM_ITERATIONS = 2000

# The EOT distribution: every step evaluates the patch under a different view
eot_transforms = T.Compose([
    T.RandomRotation(degrees=30),                 # viewing-angle changes
    T.RandomResizedCrop(224, scale=(0.7, 1.0)),   # distance / scale changes
    T.ColorJitter(brightness=0.3, contrast=0.2),  # lighting changes
])

def apply_patch(scene, patch):
    # Paste the patch at a random location in a (1, 3, 224, 224) scene
    x = random.randint(0, 224 - PATCH_SIZE)
    y = random.randint(0, 224 - PATCH_SIZE)
    scene = scene.clone()
    scene[:, :, y:y + PATCH_SIZE, x:x + PATCH_SIZE] = patch
    return scene

# The patch itself is the optimization variable, initialized to random noise
patch = torch.rand(1, 3, PATCH_SIZE, PATCH_SIZE, device=device, requires_grad=True)
optimizer = torch.optim.Adam([patch], lr=0.01)

for i in range(NUM_ITERATIONS):
    # Load a random scene image: any loader returning [0, 1] floats works here
    scene = get_random_background_image().to(device)
    # EOT core: apply the patch, then a random transformation, then score it
    patched = apply_patch(scene, patch.clamp(0, 1))
    transformed = eot_transforms(patched)
    logits = model(normalize(transformed))
    # Targeted loss: push the prediction toward "toaster" regardless of the view
    loss = F.cross_entropy(logits, torch.tensor([TARGET_CLASS], device=device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

2. Adversarial Objects

Instead of a 2D patch, you can create a 3D object whose texture or even shape is adversarially optimized. The most famous example is the 3D-printed turtle that top computer vision models like InceptionV3 consistently classify as a “rifle” from nearly any viewing angle. This is achieved by rendering the 3D model from hundreds of virtual camera angles during the optimization process (a 3D version of EOT), ensuring the adversarial texture is potent from all sides.

As a red teamer, this technique is powerful against systems that must correctly identify objects in 3D space, such as those in autonomous vehicles or robotics. The challenge moves from printing a sticker to 3D printing and material science, as the object’s real-world reflectance and texture must match the digital simulation.

3. Wearable Adversarial Items

This modality focuses on evading detection systems, particularly facial recognition and person detectors. By crafting adversarial patterns and printing them on clothing or accessories, an individual can disrupt a model’s ability to identify them.

The attack surface here is personal and direct. An adversarial t-shirt can make a person “invisible” to automated surveillance cameras, while specially designed glasses can confuse facial recognition systems into identifying the wearer as someone else or failing to find a face at all. These attacks are particularly challenging to execute because the item deforms, wrinkles, and moves with the person’s body, requiring an even more robust EOT process that accounts for non-rigid transformations.
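One practical way to account for those non-rigid effects during generation is to widen the EOT transformation distribution itself. Below is a minimal sketch reusing the torchvision setup from the patch example above; ElasticTransform serves as a rough stand-in for cloth wrinkling, not a physically accurate cloth model.

import torchvision.transforms as T

# A wider EOT distribution for a pattern printed on moving, wrinkling clothing
wearable_eot_transforms = T.Compose([
    T.ElasticTransform(alpha=50.0, sigma=5.0),    # local, wrinkle-like warping
    T.RandomPerspective(distortion_scale=0.4),    # body pose and camera-angle shifts
    T.RandomRotation(degrees=20),
    T.RandomResizedCrop(224, scale=(0.5, 1.0)),   # wider distance range for surveillance views
    T.ColorJitter(brightness=0.4, contrast=0.3),  # indoor/outdoor lighting swings
])

Swapping this distribution into the same optimization loop forces the pattern to stay adversarial even when it is folded, stretched, or viewed off-axis.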

Comparison of Wearable Adversarial Items

| Item Type | Primary Target System | Pros | Cons |
| --- | --- | --- | --- |
| Adversarial Glasses | Facial Recognition | Targets a small, critical area (the face). Can be subtle. | Limited effect on person detectors. Easily removable. |
| Adversarial T-Shirt/Sweater | Person Detectors (e.g., YOLO, SSD) | Large surface area for the pattern. Highly effective at evasion. | Complex non-rigid deformations (wrinkles) make generation difficult. Can be obstructed by a jacket. |
| Adversarial Hat/Cap | Facial Recognition / Head Detection | Targets the top of the face. Can obscure key facial landmarks. | Smaller surface area than a shirt. May be ineffective from lower camera angles. |

Red Teaming Execution: A Practical Workflow

Executing a physical attack is an iterative process that blends digital optimization with hands-on physical testing. A failure in the real world is not a dead end; it’s data for the next iteration.

[Diagram: Physical attack iterative workflow: 1. Threat Model (target & goal) → 2. Digital Generation (EOT optimization) → 3. Fabrication (printing / 3D printing) → 4. Physical Testing (real-world conditions), with an "Iterate & Refine" loop (adjust transformations, retrain the patch) feeding back into digital generation.]
  1. Threat Modeling & Target Selection: Identify the specific system (e.g., a particular brand of security camera, a specific version of an open-source model) and your goal (evasion, misclassification to a specific class). Gain as much knowledge about the target model as possible. Is it a white-box or black-box scenario?
  2. Digital Simulation & Generation: This is the computational core of the attack. Use a framework like ART or Foolbox to create the digital version of your attack (see the sketch after this list). The key is to define a realistic set of transformations for your EOT process. If you’re attacking a stationary camera, your transformations might include lighting changes and small perspective shifts. For an autonomous vehicle, you’ll need to include a much wider range of distances, angles, and motion blur.
  3. Fabrication: Translate the digital artifact into a physical one. For patches, this involves color-correct printing to ensure the physical colors match the digital RGB values as closely as possible. For 3D objects, it involves 3D printing and potentially painting. Material choice is critical. A glossy finish might create reflections that interfere with the attack, while a matte finish might absorb too much light.
  4. Physical Testing & Iteration: Take the fabricated object into the target environment and test it rigorously. Record results from different angles, distances, and lighting conditions. Document every failure. Did the attack only work from 3 meters away but not 5? Did it fail under fluorescent lighting? This feedback is invaluable. Use it to refine your set of transformations in the EOT generator (Step 2) and repeat the cycle.
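To make step 2 concrete: the transformation ranges you settle on map directly onto attack parameters. Below is a minimal sketch using ART's AdversarialPatch attack; the constructor arguments reflect recent ART releases, and the ResNet-50 wrapper, target class, and scene loader are illustrative assumptions rather than a fixed recipe.

import numpy as np
import torch
import torchvision
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import AdversarialPatch

# Wrap the target model (white-box scenario) in an ART estimator
model = torchvision.models.resnet50(weights="DEFAULT")
classifier = PyTorchClassifier(
    model=model,
    loss=torch.nn.CrossEntropyLoss(),
    input_shape=(3, 224, 224),
    nb_classes=1000,
    clip_values=(0.0, 1.0),
)

# The transformation ranges encode the threat model: a stationary indoor camera
# tolerates narrower rotation/scale ranges than a moving, vehicle-mounted one.
attack = AdversarialPatch(
    classifier,
    rotation_max=22.5,       # degrees of in-plane rotation sampled during EOT
    scale_min=0.3,           # patch covers roughly 30% to 70% of the frame
    scale_max=0.7,
    learning_rate=5.0,
    max_iter=500,
    batch_size=16,
    patch_shape=(3, 64, 64),
)

# Scenes as float arrays in [0, 1], shape (N, 3, 224, 224); one-hot targets
# steer every scene toward "toaster" (ImageNet index 859).
x = load_scene_images()      # placeholder: your own scene loader
y = np.zeros((len(x), 1000), dtype=np.float32)
y[:, 859] = 1.0

patch, patch_mask = attack.generate(x=x, y=y)
patched_preview = attack.apply_patch(x, scale=0.5)   # digital preview before fabrication

The widths of these ranges are exactly what you revisit in step 4: if the printed patch fails beyond a certain distance or under certain lighting, widen the corresponding range and regenerate.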

Key Takeaways

  • Physical attacks are the proof-of-concept for real-world AI system vulnerability, moving beyond theoretical digital exploits.
  • The “reality gap” between digital and physical is bridged by optimizing an attack over a distribution of transformations (EOT), making it robust to real-world variations.
  • Common modalities include 2D patches, 3D objects, and wearable items, each targeting different system types like object detectors and facial recognition.
  • Successful execution requires an iterative workflow: digitally generate, physically fabricate, test in the real world, and use failures as data to refine the digital generation.
  • As a red teamer, your ability to anticipate and simulate physical-world conditions during the digital phase directly determines the success of your physical attack.