0.14.3 Physical harm: critical infrastructure failure, fatal accidents

2025.10.06.
AI Security Blog

A power grid optimizer, deceived by manipulated sensor data, triggers a regional blackout. An autonomous haulage truck in a mine misidentifies a worker as a stationary object, resulting in a fatal collision. A water treatment facility’s control system, guided by a compromised AI, releases unsafe water into the public supply. These are not hypothetical future risks; they are the kinetic consequences of AI failures in the physical world.

The Cyber-Physical Bridge: Where Code Causes Consequences

The potential for physical harm emerges when an AI system is no longer confined to digital outputs like text or images. Instead, it becomes the decision-making core of a Cyber-Physical System (CPS). In a CPS, software algorithms and AI models directly influence and control physical machinery, actuators, and processes. This tight integration creates a “bridge” where a digital vulnerability can cross over into the physical realm with tangible, often irreversible, consequences.
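
To make that bridge concrete, the minimal sketch below walks through one sense, decide, act cycle of a CPS controller. The component names, thresholds, and lambda stand-ins are illustrative assumptions, not the API of any real system.

# Minimal sketch of one sense -> decide -> act cycle in a cyber-physical system
# (component names and values are illustrative assumptions, not a real product API)

def control_cycle(read_sensor, decide, actuate):
    measurement = read_sensor()    # 1. Sense: raw data from the physical world
    command = decide(measurement)  # 2. Decide: the AI model turns data into a decision
    actuate(command)               # 3. Act: the decision becomes a physical action

# Stand-in components wired together for a single cycle:
pressure_bar = 7.4
control_cycle(
    read_sensor=lambda: pressure_bar,
    decide=lambda p: "open_relief_valve" if p > 7.0 else "hold",
    actuate=lambda cmd: print(f"actuator <- {cmd}"),
)
# A compromise at any of the three steps changes the physical outcome,
# even if the other two behave exactly as designed.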

As a red teamer, your focus must expand beyond data breaches and model theft. You must understand the entire causal chain from sensor input to physical action. A failure at any point in this chain—whether due to a malicious attack, environmental noise, or a simple software bug—can propagate through the system and manifest as a dangerous physical event.

The Attack Chain in Cyber-Physical Systems

An attacker doesn’t need to rewrite the AI’s core code to cause harm. Intervening at specific points in the operational loop is often more effective and harder to detect. The goal is to make the AI an unwitting accomplice in causing a physical failure.

[Diagram: Cyber-Physical System Attack Chain. Physical World -> Sensors (LiDAR, Camera) -> AI Model (Perception) -> Actuator (Brakes, Valve), with attack points: (1) Sensor Spoofing, (2) Model Evasion, (3) Command Injection.]
  1. Sensor Manipulation: The attacker feeds false data to the system. This could involve physical adversarial patches on objects (e.g., stickers on a stop sign), GPS spoofing, or injecting false signals into temperature or pressure sensors. The AI model itself behaves correctly, but its decisions are based on a fabricated reality (a short spoofing sketch follows this list).
  2. Model Evasion/Poisoning: The attacker targets the AI model directly. An evasion attack crafts an input that the sensors perceive correctly but the model misclassifies (e.g., a pedestrian classified as a non-threatening object). A poisoning attack corrupts the model during its training phase, creating hidden backdoors that can be triggered later.
  3. Actuator/Command Hijacking: The attacker bypasses the AI entirely and seizes control of the physical components. Even if the AI makes a safe decision (e.g., “apply brakes”), the attacker injects a malicious command (“accelerate”) directly into the vehicle’s control unit.
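
As a minimal illustration of point 1, the sketch below simulates a hypothetical attacker sitting between a temperature sensor and an AI process controller: the real readings climb toward a runaway reaction, but the values the controller sees are clamped to a plausible band. All names, limits, and values are invented for this example.

# Sketch: spoofing a temperature feed so the AI controller sees a "normal" plant
# (all names, limits, and values are illustrative assumptions)

SAFE_LIMIT_C = 90.0  # the controller shuts the process down above this

def spoof_reading(true_reading_c):
    # The attacker replays plausible values instead of the real measurement,
    # hiding a runaway reaction from the downstream controller.
    return min(true_reading_c, 72.0)

def controller_decision(reported_c):
    return "EMERGENCY_SHUTDOWN" if reported_c > SAFE_LIMIT_C else "CONTINUE"

true_temps_c = [65.0, 78.0, 94.0, 113.0, 135.0]  # the real, escalating readings
for t in true_temps_c:
    reported = spoof_reading(t)
    print(f"true={t:5.1f} C  reported={reported:5.1f} C  -> {controller_decision(reported)}")
# The controller behaves exactly as designed, but its decisions track a fabricated reality.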

Domains of Physical Risk and Failure Modes

The compliance and regulatory landscape is rapidly evolving to address these risks. Standards like ISO 26262 (automotive functional safety) and IEC 61508 (functional safety of electrical/electronic/programmable electronic systems) are being adapted to account for the non-deterministic nature of AI. Your red teaming efforts provide critical evidence for safety case reports and demonstrate due diligence in mitigating foreseeable harm.

Domain | AI Application | Potential Physical Harm | Example Attack Vector
Critical Infrastructure | Electrical Grid Load Balancing | Cascading blackouts, equipment damage from power surges | Injecting falsified demand data from smart meters to trick the AI into destabilizing the grid (sketched below the table)
Autonomous Transportation | Self-Driving Vehicle Perception | Fatal collisions with pedestrians, other vehicles, or obstacles | Using an adversarial patch on a truck to make it “invisible” to a car’s object detection model
Healthcare | Robotic Surgery Assistant | Patient injury or death from incorrect incisions or movements | Slightly altering the surgical tool’s position data sent to the AI, causing it to miscalculate its path
Industrial Automation | Chemical Plant Process Control | Explosions, toxic leaks, worker injury from runaway reactions | Poisoning the model that monitors temperature and pressure, creating a backdoor to ignore critical warnings
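
To make the first row of the table concrete, here is a toy, assumed load-balancing policy driven by aggregated smart-meter reports; the numbers and the decision rule are illustrative and do not describe any real grid operator's system.

# Toy sketch: falsified smart-meter reports steering a load-balancing decision
# (illustrative only; real grid control involves far more state and safeguards)

def plan_dispatch(reported_demand_mw, online_capacity_mw):
    # Simplistic policy: shed generation when reported demand falls well below capacity
    if reported_demand_mw < 0.5 * online_capacity_mw:
        return "TAKE_GENERATORS_OFFLINE"
    return "HOLD_CURRENT_DISPATCH"

true_demand_mw = 820.0      # what the region is actually drawing
online_capacity_mw = 1000.0
spoofed_demand_mw = 430.0   # attacker-injected smart-meter aggregate

print(plan_dispatch(true_demand_mw, online_capacity_mw))     # HOLD_CURRENT_DISPATCH
print(plan_dispatch(spoofed_demand_mw, online_capacity_mw))  # TAKE_GENERATORS_OFFLINE
# Acting on the spoofed figure leaves real demand unserved and risks a cascading failure.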

The Simplicity of Catastrophe

A catastrophic failure doesn’t require a complex, multi-stage exploit. Sometimes, a minor, well-placed perturbation is enough to push a system from a safe state to a dangerous one. Consider a simplified AI controller for an industrial robotic arm that must avoid a designated “human safety zone.”

# Pseudocode for a safety-critical robotic arm controller
def get_arm_target(camera_input, operator_command):

    # AI model identifies key objects and zones from the camera feed
    detected_objects = vision_model.predict(camera_input)

    # Check whether the commanded target lies inside the prohibited zone.
    # An attacker who subtly manipulates the camera input, adding noise that
    # causes the model to slightly miscalculate the zone's boundary, can make
    # this check pass when it should fail.
    is_in_safety_zone = check_zone_collision(
        operator_command.target_coords,
        detected_objects.safety_zone,
    )

    if is_in_safety_zone:
        # SAFE: deny the command and halt the arm
        return HALT_COMMAND
    else:
        # UNSAFE: the AI believes the target is clear and proceeds with the movement
        return operator_command

In this example, the logic is sound, but it relies entirely on the AI model’s perception of the safety zone. A successful adversarial attack on the vision model—perhaps by manipulating lighting or placing a small, patterned object—could shrink the perceived zone just enough for the arm to enter it, posing a direct threat to human life.
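
The toy calculation below (assumed geometry and values, not taken from any real controller) shows how small that margin can be: shifting the perceived boundary of the safety zone by a few centimetres flips the check_zone_collision result in the pseudocode above.

# Toy numbers showing how a small perception error flips the safety check
# (geometry and values are assumptions made for illustration)

def check_zone_collision(target_xy, zone):
    # zone is an axis-aligned box in metres: (x_min, y_min, x_max, y_max)
    x, y = target_xy
    x_min, y_min, x_max, y_max = zone
    return x_min <= x <= x_max and y_min <= y <= y_max

target = (1.48, 0.30)                        # commanded arm position, metres
true_zone = (1.45, -0.50, 2.50, 0.50)        # where the person actually is
perceived_zone = (1.52, -0.50, 2.50, 0.50)   # boundary shifted ~7 cm by the attack

print(check_zone_collision(target, true_zone))       # True  -> controller should halt
print(check_zone_collision(target, perceived_zone))  # False -> arm proceeds into the zone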

These scenarios highlight a fundamental shift in security. When AI controls physical systems, the consequences of a breach are no longer just financial or reputational. They are measured in property destroyed, infrastructure crippled, and lives lost. This elevates the responsibility of AI security professionals, moving their work from protecting data to protecting people and society itself. While these events are catastrophic on a local or regional scale, they also serve as a crucial foundation for understanding risks that could scale to a global, civilizational level.