9.1.2. Perception Stack Attacks

2025.10.06.
AI Security Blog

Sensor deception attacks, as previously discussed, tamper with the raw data flowing into a self-driving vehicle. Here, we move one layer up the processing pipeline to the system's cognitive core: the perception stack. This is where raw sensor data is interpreted and transformed into a coherent understanding of the world. By attacking this stage, you don't just feed the system false information; you manipulate its very ability to reason about its environment.

The Perception Stack: From Photons to Objects

An autonomous vehicle’s perception stack is a pipeline of algorithms and models responsible for answering the fundamental question: “What is out there?” It fuses data from various sensors (cameras, LiDAR, radar) to detect, classify, and track objects, as well as identify static elements like lane lines and traffic signs. A successful attack here can create a phantom vehicle, erase a real pedestrian, or warp the vehicle’s understanding of road geometry.


[Diagram: the perception stack as an attack surface. Camera, LiDAR, and radar feed into Sensor Fusion, then Detection & Classification, then Tracking & Prediction, producing the World Model.]

Visual representation of the perception stack pipeline, the primary target for these attacks.

Core Attack Vectors on Perception Models

Attacks on the perception stack are not about brute force; they are about surgical strikes against the logic of the underlying models. Your goal as a red teamer is to find the subtle inputs that produce catastrophically wrong outputs.

1. Adversarial Patch Attacks (Physical Realm)

This is one of the most demonstrable attacks against camera-based perception. You generate a visually noisy or abstract-looking sticker (a “patch”) that, when placed on an object, forces a deep learning model to misclassify it. The key is that the patch is robust to changes in viewing angle, distance, and lighting conditions.

For example, a patch placed on a stop sign could cause it to be consistently identified as a “Speed Limit 80” sign or, more dangerously, ignored entirely. The attack leverages the model’s over-reliance on specific textures and patterns it learned during training.

# Pseudocode for generating an adversarial patch
def generate_patch(model, target_image, adversarial_class):
    # Initialize a random noise patch
    patch = initialize_random_patch(size=(100, 100))

    for i in range(max_iterations):
        # Apply the patch with random rotation, scale, and lighting so it
        # stays effective under real-world viewing conditions
        patched_image = apply_transformed_patch(target_image, patch)

        # Get the model's class probabilities
        prediction = model.predict(patched_image)

        # Loss: how far the prediction is from the adversarial class.
        # Minimizing it pushes the model toward the wrong label.
        loss = loss_function(prediction, adversarial_class)

        # Update the patch to decrease the loss (gradient descent on the
        # patch pixels; the image and the model's weights stay fixed)
        gradients = calculate_gradients(loss, patch)
        patch -= learning_rate * gradients

        # Keep pixel values in a printable range
        patch = clip(patch, 0, 1)

    return patch

2. 3D Adversarial Objects

Taking the patch concept into three dimensions, you can craft physical objects that are adversarial from nearly any viewpoint. This is particularly effective against systems that fuse camera and LiDAR data. A 3D-printed object can be designed with a shape and texture that consistently reads as something else to the perception system. A famous research example is a 3D-printed turtle that image classifiers consistently labeled as a rifle. For a red teamer, this means creating a physical object (like a traffic cone or a piece of road debris) that can "cloak" a real hazard or create a phantom one.
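The viewpoint robustness behind such objects comes from optimizing the adversarial texture over a whole distribution of viewing conditions rather than a single image, an idea known as Expectation over Transformation (EOT). The sketch below is a toy illustration of that idea only: the scoring function and "viewpoint" simulation are invented stand-ins, not a real detector or printable texture.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_score(x):
    # Stand-in for a detector's confidence in the true class:
    # here, simply the mean texture intensity.
    return float(x.mean())

def random_transform(texture):
    # Simulated viewpoint change: random brightness plus sensor noise.
    brightness = rng.uniform(0.8, 1.2)
    noise = rng.normal(0.0, 0.01, texture.shape)
    return np.clip(texture * brightness + noise, 0.0, 1.0)

def expected_score(texture, n_samples=32):
    # The EOT objective: average the score over sampled transformations,
    # so the optimized texture stays adversarial from many viewpoints.
    return float(np.mean([toy_score(random_transform(texture))
                          for _ in range(n_samples)]))

texture = rng.uniform(0.4, 0.6, size=(8, 8))
before = expected_score(texture)

# The toy score's gradient w.r.t. each texel is 1/texture.size, so plain
# gradient descent steadily lowers the expected score across all
# sampled transformations at once.
lr = 1.0
for _ in range(20):
    grad = np.full_like(texture, 1.0 / texture.size)
    texture = np.clip(texture - lr * grad, 0.0, 1.0)

after = expected_score(texture)
```

In a real attack the toy score would be replaced by a differentiable surrogate model, and the transformation sampler would cover rotation, scale, perspective, and printer color error.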

3. Exploiting Sensor Fusion Logic

The fusion algorithm is a critical single point of failure. It arbitrates when sensors disagree. An attack here might not fool any single sensor but instead exploits the rules of fusion. For instance, you could use a GPS spoofer (sensor attack) to slightly alter the vehicle’s location, causing LiDAR points to be misaligned with camera pixels. This desynchronization can lead the fusion algorithm to discard valid detections from both sensors, effectively creating a blind spot where there is none.

Your objective is to create a "contested" reality where the fusion logic makes a safety-critical error in judgment, for example by making a nearby car appear to be in a different lane through manipulation of the confidence scores of camera and radar detections.
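The discard-on-disagreement failure mode can be sketched with a toy gating rule. The gate distance and the reject-both policy below are illustrative assumptions, not any production fusion algorithm:

```python
import math

GATE_METERS = 0.5  # max allowed camera/LiDAR disagreement (assumed)

def fuse(camera_det, lidar_det, gate=GATE_METERS):
    # camera_det / lidar_det: (x, y) positions of the same object
    # in the vehicle frame, in metres.
    dist = math.dist(camera_det, lidar_det)
    if dist <= gate:
        # Detections agree: report the fused (averaged) position.
        return tuple((c + l) / 2 for c, l in zip(camera_det, lidar_det))
    # Detections disagree: this naive logic rejects both as noise,
    # creating exactly the blind spot the attacker wants.
    return None

# Normal operation: both sensors see the pedestrian at ~(10, 2).
assert fuse((10.0, 2.0), (10.1, 2.05)) is not None

# GPS spoofing shifts the LiDAR frame by ~2 m: association fails
# and a real object vanishes from the fused world model.
assert fuse((10.0, 2.0), (12.0, 2.05)) is None
```

Real fusion stacks use probabilistic gating and track-level confidence rather than a hard threshold, but the underlying arbitration logic remains a target: make the sensors disagree, and the tie-breaking rules do the damage for you.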

4. Object Tracking and Prediction Manipulation

Perception isn’t just about a single snapshot in time. The stack tracks objects across frames to predict their future trajectories. Attacks on this temporal component are subtle and highly effective.

An “ID-switching” attack aims to confuse the tracker. By subtly modifying an object’s appearance between frames (e.g., using projected light), you can make the tracker believe one object has disappeared and a new one has appeared. This resets the object’s velocity and trajectory history, making the vehicle’s planning module unable to predict its movement accurately. This can induce sudden, unnecessary braking or prevent a necessary evasive maneuver.

| Tracking State | Normal Operation | Under ID-Switching Attack |
| --- | --- | --- |
| Frame T | Object ID: 12, Class: Car, Velocity: 50 kph | Object ID: 12, Class: Car, Velocity: 50 kph |
| Frame T+1 | Object ID: 12, Class: Car, Velocity: 51 kph | ID 12 lost; new ID 13, Class: Car, Velocity: 0 kph (unknown) |
| Impact | Planning module predicts continued forward motion. | Planning module assumes a new, stationary object has appeared, potentially causing erratic braking. |
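This failure falls out of even the simplest association logic. Below is a toy one-dimensional nearest-neighbour tracker (the gate size and state layout are made up for illustration) showing how a single suppressed frame resets an object's identity and history:

```python
ASSOC_GATE = 2.0  # max movement (m) for frame-to-frame association

class ToyTracker:
    def __init__(self):
        self.tracks = {}   # track id -> last position (1-D for brevity)
        self.next_id = 1

    def update(self, detections):
        new_tracks, assigned = {}, set()
        for tid, pos in self.tracks.items():
            # Associate each track with the closest unclaimed detection.
            best = min((d for d in detections if d not in assigned),
                       key=lambda d: abs(d - pos), default=None)
            if best is not None and abs(best - pos) <= ASSOC_GATE:
                new_tracks[tid] = best
                assigned.add(best)
        for d in detections:
            if d not in assigned:
                # Unmatched detection: a brand-new track with no
                # velocity or trajectory history.
                new_tracks[self.next_id] = d
                self.next_id += 1
        self.tracks = new_tracks
        return set(new_tracks)

tracker = ToyTracker()
tracker.update([10.0])        # frame T: car detected, gets ID 1
tracker.update([11.0])        # frame T+1: moves 1 m, keeps ID 1
# Adversarial frame: projected light suppresses the detection entirely.
tracker.update([])            # track 1 is dropped
ids = tracker.update([13.0])  # car reappears as a fresh ID, history gone
```

After the suppressed frame, `ids` contains only the new track ID: the planner now sees an unknown, apparently stationary object where it previously had a moving car with a predictable trajectory.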

Red Teaming Execution: A Practical Scenario

Executing a perception stack attack requires moving from the digital to the physical. Consider this attack chain for erasing a pedestrian from a vehicle’s view:

  1. Reconnaissance: The red team identifies the target vehicle’s camera systems and, through open-source intelligence, determines the likely family of object detection models in use (e.g., a variant of YOLO).
  2. Attack Development (Offline): Using a white-box or a surrogate model, the team generates an adversarial pattern designed to be worn on clothing. The optimization goal is to push the “person” class confidence score below the detection threshold.
  3. Deployment: The pattern is printed onto a shirt or jacket. The physical artifact is now ready.
  4. Execution (Physical): A team member wearing the adversarial clothing walks into the path of the target vehicle in a controlled test environment.
  5. Analysis: The team monitors the vehicle’s internal diagnostics or external behavior. A successful attack is one where the debug view shows no bounding box around the pedestrian, and the vehicle fails to decelerate as it would for a normally detected person. This directly compromises the decision-making module that follows.
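The optimization goal in step 2 can be sketched with a toy surrogate: drive the "person" confidence below the detection threshold. The confidence function here is an invented differentiable stand-in, not a real YOLO head, and the update rule is the analytic gradient of that stand-in:

```python
import numpy as np

DETECTION_THRESHOLD = 0.5  # assumed detector cutoff

def person_confidence(pattern):
    # Invented surrogate: confidence decays as the pattern's energy
    # grows, standing in for a differentiable detection head.
    return 1.0 / (1.0 + float(np.sum(pattern ** 2)))

pattern = np.full((16, 16), 0.01)         # faint starting texture
before_conf = person_confidence(pattern)  # well above the threshold

for _ in range(200):
    # Gradient ascent on sum(pattern**2), whose gradient is 2*pattern;
    # increasing it pushes the surrogate confidence down. Clipping
    # keeps the pattern in a printable pixel range.
    pattern = np.clip(pattern + 0.05 * 2 * pattern, 0.0, 1.0)

after_conf = person_confidence(pattern)
attack_succeeded = after_conf < DETECTION_THRESHOLD
```

In a real engagement the surrogate would be a trained detector, the pattern would be optimized with EOT-style transformations to survive printing and body movement, and success would be judged exactly as in step 5: no bounding box around the wearer.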

This type of engagement demonstrates a tangible failure in the AI’s ability to perceive its environment, providing clear, undeniable evidence of a critical vulnerability far more impactful than a simple software crash.