An autonomous vehicle approaches an intersection. Its perception system, a marvel of sensor fusion and deep learning, correctly identifies the red light, the cross traffic, and the pedestrians waiting to cross. Yet, the vehicle smoothly accelerates into the intersection. The failure wasn’t in seeing the world, but in deciding what to do about it. This is the domain of decision-making manipulation.
While previous chapters focused on deceiving the vehicle’s “eyes” through sensor and perception attacks, this chapter targets its “brain”—the complex chain of models that translate a recognized scene into a physical action. Here, the red teamer’s goal is to corrupt the logic, not the input. You assume the perception stack is working perfectly but seek to exploit the internal reasoning that governs behavior.
The Path from Perception to Action: A Chain of Trust
An autonomous vehicle’s decision process is not a single monolithic block. It’s a pipeline where the output of one stage becomes the trusted input for the next. An attack on any link in this chain can have catastrophic consequences. The primary targets for manipulation lie between raw perception and final vehicle commands.
- Prediction: This stage forecasts the future behavior of other agents (e.g., “the pedestrian will step into the crosswalk,” “that car will merge”). It’s often powered by learned sequence or graph models, such as LSTMs or graph neural networks (GNNs), trained on recorded traffic patterns.
- Planning: This is the core decision-maker. It takes the perceived world state and predicted futures to calculate an optimal trajectory—a sequence of waypoints, speeds, and accelerations. This involves complex algorithms balancing safety, efficiency, and passenger comfort.
- Control: This final stage translates the planned trajectory into concrete commands for the steering wheel, throttle, and brakes (e.g., “turn steering 5 degrees left,” “apply 20% brake pressure”).
Our focus is on the Prediction and Planning stages. A successful attack here doesn’t create a fake obstacle; it makes the car believe a dangerous maneuver is the safest and most logical option.
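The chain of trust above can be sketched as a few stages wired in sequence, each consuming the previous stage’s output without re-verifying it. This is an illustrative Python sketch only; the stage interfaces, thresholds, and field names are invented for the example, not taken from any production stack.

```python
from dataclasses import dataclass

@dataclass
class WorldState:
    ego_speed_mps: float
    light: str           # "RED" or "GREEN" (hypothetical perception output)
    lead_gap_m: float    # distance to the nearest agent ahead

def predict(state: WorldState) -> float:
    """Forecast the gap one second ahead (naive constant-gap model)."""
    return state.lead_gap_m  # the planner will trust this blindly

def plan(state: WorldState, predicted_gap_m: float) -> str:
    """Pick a maneuver; trusts both perception and prediction outputs."""
    if state.light == "RED" or predicted_gap_m < 10.0:
        return "BRAKE"
    return "PROCEED"

def control(maneuver: str) -> dict:
    """Translate the planned maneuver into low-level actuator commands."""
    if maneuver == "PROCEED":
        return {"throttle": 0.3, "brake": 0.0}
    return {"throttle": 0.0, "brake": 0.6}

# Each stage consumes the previous stage's output with no cross-checking:
state = WorldState(ego_speed_mps=12.0, light="GREEN", lead_gap_m=40.0)
cmd = control(plan(state, predict(state)))
```

Corrupting any single link (for example, the value returned by `predict`) is enough to change `cmd`, even though every downstream stage behaves exactly as designed.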
Core Attack Vectors
Manipulating the decision-making core requires a deeper understanding of the vehicle’s internal logic than simply altering sensor data. The attacks are more subtle and often target the assumptions baked into the models.
3.1 Corrupting Prediction Models
If you can control how the AV predicts the future, you can control how it acts in the present. The goal is to poison the prediction model’s input to generate a plausible but dangerously incorrect forecast. This could be achieved by compromising an upstream module (like perception) or via V2X communication to inject false kinematic data about another vehicle.
Consider a model that predicts whether a car at a T-junction will turn or go straight. An attacker could feed it data that makes it confidently predict a turn, prompting the AV to pull out. The result is a T-bone collision caused not by a sensor failure, but by a manipulated forecast.
```python
# Pseudocode for a decision based on a compromised prediction
def decide_at_intersection(other_car_state):
    # The prediction model is the attack target
    predicted_action = prediction_model.forecast(other_car_state)
    # Attacker manipulates the input to force this prediction
    if predicted_action == "TURNING_RIGHT":
        # Planner sees a safe opening based on the faulty prediction
        return plan_trajectory("PROCEED_STRAIGHT")  # DANGEROUS ACTION
    # The correct, safe action
    return plan_trajectory("WAIT")
```
3.2 Exploiting the Planner’s Logic
The planning module operates on a set of rules, costs, and rewards. It tries to find a path that minimizes “cost” (risk of collision, discomfort, rule-breaking) while maximizing “reward” (progress towards destination). Attacks on the planner exploit this optimization process.
| Attack Type | Description | Red Team Objective Example |
|---|---|---|
| Cost Function Manipulation | Artificially inflate the “cost” of a safe action or reduce the cost of a dangerous one. This is done by manipulating the planner’s internal representation of the world. | Make an empty, safe lane appear infinitely costly (e.g., by injecting a “phantom obstacle” into the planning map), forcing the AV into a more crowded or hazardous lane. |
| Reward Hacking | Exploit the reward function of a reinforcement learning-based planner. The goal is to find an unexpected behavior that the model believes will maximize its reward. | Cause the vehicle to “jerk” aggressively by creating a scenario where rapid acceleration/deceleration cycles are incorrectly rewarded by the RL policy. |
| Goal Hijacking | Modify the vehicle’s high-level goal or destination. This is less about immediate safety and more about mission failure. | Subtly alter the final destination coordinates in the navigation system, causing a delivery vehicle to go to the wrong warehouse or a taxi to drive miles off-course. |
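To make cost-function manipulation concrete, here is a minimal sketch of a planner choosing the lowest-cost lane, and how a “phantom obstacle” injected into its planning map flips the decision. The cost values, lane names, and message fields are illustrative assumptions, not any real planner’s interface.

```python
def lane_cost(lane: str, obstacles: list) -> float:
    """Cost = base traffic density plus a large penalty per blocking obstacle."""
    penalty = sum(1000.0 for obs in obstacles if obs["lane"] == lane)
    base = {"left": 5.0, "right": 2.0}[lane]  # hypothetical base costs
    return base + penalty

def choose_lane(obstacles: list) -> str:
    """The optimizer simply picks the cheapest lane it knows about."""
    return min(("left", "right"), key=lambda lane: lane_cost(lane, obstacles))

# Honest world: the right lane is empty and cheapest.
assert choose_lane([]) == "right"

# Attacker injects a phantom obstacle into the planning map (e.g., via
# a spoofed V2X message) and the safe lane becomes "infinitely" costly:
phantom = [{"lane": "right", "source": "spoofed_v2x"}]
assert choose_lane(phantom) == "left"  # planner now avoids the safe lane
```

The optimizer is working exactly as designed; only its picture of the world has been poisoned.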
3.3 State Estimation Poisoning
This is a slow-burn attack that corrupts the vehicle’s fundamental understanding of its own state—its precise location, velocity, and orientation. Unlike simple GPS spoofing, which might be caught by cross-referencing with other sensors (IMU, LiDAR SLAM), a sophisticated attack subtly biases the sensor fusion algorithm itself.
The red team’s goal is to introduce a small, persistent bias that accumulates over time. For example, you might nudge each fused position estimate a few centimeters to the right, renewing that offset with every fusion cycle so each individual correction looks like ordinary sensor noise. After a few hundred meters, the accumulated “localization drift” could be a full meter, placing the car outside its lane while its internal model reports that it is perfectly centered. The planner, acting on this false information, has no reason to correct course.
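A toy model of this accumulation, assuming the attacker can slip a millimeter-scale bias into each fusion update. The per-update bias and cycle count are invented for illustration; the point is only that sub-noise-floor errors compound.

```python
TRUE_LATERAL_M = 0.0       # the car is actually centered in its lane
BIAS_PER_UPDATE_M = 0.001  # 1 mm injected per fusion cycle (hypothetical)

estimate = TRUE_LATERAL_M
for _ in range(1000):           # roughly a few hundred meters of driving
    estimate += BIAS_PER_UPDATE_M  # bias slipped into each fused update

drift = estimate - TRUE_LATERAL_M
print(f"accumulated drift: {drift:.2f} m")  # ~1.00 m, yet no single update looked anomalous
```

A naive plausibility gate that checks each update in isolation never fires, because no single correction exceeds normal sensor noise.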
Red Teaming Strategies for Decision Systems
Testing these systems requires moving from the physical world to the logical. Your toolkit will include simulators and model analysis frameworks more than physical adversarial objects.
- Adopt an Objective-First Mindset: Don’t just ask, “Can I fool the planner?” Ask, “Can I make the AV miss its highway exit?” or “Can I trap it in a roundabout indefinitely?” Define a clear, unsafe behavioral outcome and work backward to find the vulnerability in the decision logic that enables it.
- Leverage High-Fidelity Simulation: Physical testing is impractical and dangerous. Use simulators like CARLA, LGSVL, or NVIDIA DRIVE Sim to create complex traffic scenarios and directly manipulate the data flowing between internal modules. This allows you to isolate the prediction or planning components and test them with surgical precision.
- Fuzz the Decision Logic: Go beyond standard fuzzing. Use “semantic fuzzing” where you inject syntactically valid but logically absurd data into the planner. What happens if it receives predictions for 50 pedestrians suddenly appearing in a lane? Or if a predicted vehicle trajectory violates the laws of physics? A brittle system will crash or behave erratically.
- Reverse-Engineer the Cost Function: Through careful observation in a simulated environment, you can begin to infer the planner’s priorities. Does it prioritize maintaining speed over a comfortable following distance? Is it overly cautious around cyclists? Once you understand what the planner “values,” you can craft scenarios that pit its priorities against each other to create a safety-critical conflict.
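A semantic fuzzer along these lines might generate syntactically well-formed prediction messages that violate physical plausibility, then measure how many a plausibility gate catches. The message schema, agent types, and speed limits below are hypothetical, sketched only to show the shape of the harness.

```python
import random

def make_absurd_prediction(rng: random.Random) -> dict:
    """Syntactically valid message, physically impossible content."""
    return {
        "agent_id": rng.randrange(10_000),
        "type": "PEDESTRIAN",
        "speed_mps": rng.uniform(50.0, 300.0),  # far beyond human limits
        "position": (rng.uniform(-5.0, 5.0), rng.uniform(0.0, 100.0)),
    }

def is_physically_plausible(pred: dict) -> bool:
    """The kind of input check a robust planner should apply."""
    speed_limits = {"PEDESTRIAN": 12.0, "VEHICLE": 90.0}  # illustrative caps
    return pred["speed_mps"] <= speed_limits.get(pred["type"], 0.0)

rng = random.Random(0)  # seeded for reproducible fuzz runs
fuzz_cases = [make_absurd_prediction(rng) for _ in range(100)]
rejected = sum(1 for p in fuzz_cases if not is_physically_plausible(p))
print(f"{rejected}/100 absurd inputs rejected")
```

A system under test that accepts any of these messages, or crashes on them, has revealed a brittle trust boundary between prediction and planning.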
Defensive Horizons
Securing the decision-making stack is an ongoing research challenge, but several defensive principles are emerging:
- Redundancy and Diversity: Employ multiple, independently developed planning and prediction algorithms. An attack that fools one model is less likely to fool a diverse ensemble. The system can cross-check their outputs for significant divergence.
- Specification-Based Monitoring: Implement a high-level, rule-based “sanity checker” that monitors the AI’s proposed actions. This system operates on simple, verifiable rules like, “The chosen trajectory must not cross a solid lane marking,” or “The planned acceleration cannot exceed 0.5g.” If the AI’s plan violates these hard-coded safety specifications, a fallback maneuver is initiated.
- Input and Output Validation: Treat every module as a potential adversary. The planning module should validate the inputs from the prediction module, checking for physical plausibility and statistical anomalies. Similarly, the control module should validate the trajectory it receives from the planner.
- Causal Analysis and Explainability (XAI): If a vehicle makes a strange decision, defensive systems should be able to ask “why?” XAI tools can help trace a dangerous plan back to the specific prediction or input that caused it. If the cause is nonsensical (e.g., swerving because of a predicted event a mile away), the action can be flagged and vetoed.
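The specification-based monitor described above can be sketched as a small rule gate that vets every plan the learned planner proposes. The rule set, the 0.5g threshold, and the plan format are illustrative assumptions drawn from the examples in this section, not a real stack’s interface.

```python
MAX_ACCEL_MPS2 = 0.5 * 9.81  # "planned acceleration cannot exceed 0.5g"

def violations(plan: dict) -> list:
    """Return every hard-coded safety rule the proposed plan breaks."""
    reasons = []
    if any(abs(a) > MAX_ACCEL_MPS2 for a in plan["accels_mps2"]):
        reasons.append("acceleration exceeds 0.5g")
    if plan.get("crosses_solid_line"):
        reasons.append("trajectory crosses a solid lane marking")
    return reasons

def gate(plan: dict, fallback: dict):
    """Veto non-compliant plans and substitute a fallback maneuver."""
    reasons = violations(plan)
    return (fallback, reasons) if reasons else (plan, [])

fallback = {"maneuver": "COMFORT_STOP", "accels_mps2": [-2.0],
            "crosses_solid_line": False}
risky = {"maneuver": "SWERVE", "accels_mps2": [6.0],
         "crosses_solid_line": True}

chosen, why = gate(risky, fallback)
# chosen is the fallback; why lists both violated rules
```

Because the rules are simple and hard-coded, the monitor itself presents a far smaller attack surface than the learned planner it supervises.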
Ultimately, protecting an AV’s decision-making core means accepting that even with perfect perception, the system’s internal logic can be turned against itself. The next frontier of AV security lies in building systems that are not just smart, but also wise and resilient to logical manipulation.