33.5.1 Physiological signal analysis

2025.10.06.
AI Security Blog

Threat Scenario: An attacker generates a deepfake video of a CFO authorizing an urgent, multi-million dollar wire transfer. The video is high-resolution, the voice is perfect, and the lip-sync is flawless. The receiving employee sees no obvious visual glitches. However, an automated detection system flags the video for review. The reason: the CFO in the video has no discernible pulse.

While generative models excel at mimicking macroscopic features—the shape of a face, the sound of a voice—they fundamentally struggle with the subtle, involuntary biological processes that define living beings. Physiological signal analysis is a detection technique that moves beyond looking for pixel-level artifacts and instead searches for the “ghost in the machine”—the absence of life signals that should be present in any video of a real person.

Key Physiological “Tells”

These methods exploit the fact that deepfake synthesis is often a frame-by-frame process that doesn’t inherently model the continuous biological systems of a human subject. As a red teamer, you must understand these signals to either bypass detectors or to test their efficacy.

1. The Phantom Heartbeat: Remote Photoplethysmography (rPPG)

Every time your heart beats, it pumps blood through your body. This causes minute changes in the volume of blood in the vessels just under your skin, which in turn leads to subtle, periodic variations in skin color. These changes are invisible to the naked eye but can be detected by analyzing video frames.

The technique, known as remote photoplethysmography (rPPG), typically isolates the green channel (where hemoglobin absorption is strongest) from a region of interest, such as the forehead, and tracks its mean intensity over time. A Fast Fourier Transform (FFT) of this signal reveals a dominant frequency corresponding to the subject’s heart rate.

Comparison of Real vs. Fake Physiological Signals
Real Signal (rPPG) Synthetic/Deepfake Signal
Natural variance, noise Often too perfect or completely absent

The Deepfake Tell: Most deepfakes lack this rPPG signal entirely. The “skin” is a synthetic texture with no underlying blood flow. More advanced fakes might attempt to inject a synthetic signal, but it is often unnaturally regular or fails to correlate with the person’s speech, movement, or apparent emotional state.
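One simple way to operationalize this tell is to measure how concentrated the in-band spectral power is: a genuine pulse spreads energy across nearby frequency bins, while a naively injected tone packs nearly all of it into one. The sketch below (NumPy only; the score function and signal models are illustrative assumptions, not taken from any particular detector) demonstrates the idea on synthetic signals:

```python
import numpy as np

def rppg_regularity_score(signal, fps):
    # Fraction of in-band (0.7-3 Hz) spectral power in the single largest bin.
    # Near 1.0 = suspiciously pure tone; real pulses score noticeably lower.
    spectrum = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = spectrum[(freqs >= 0.7) & (freqs <= 3.0)] ** 2
    return 0.0 if power.sum() == 0 else power.max() / power.sum()

fps = 30
t = np.arange(0, 20, 1.0 / fps)
rng = np.random.default_rng(0)
# "Real": ~72 BPM tone with beat-to-beat phase wander plus measurement noise
real = np.sin(2 * np.pi * 1.2 * t + np.cumsum(rng.normal(0, 0.1, t.size)))
real += 0.5 * rng.normal(size=t.size)
# "Fake": a perfectly clean, injected 72 BPM tone
fake = np.sin(2 * np.pi * 1.2 * t)

print(rppg_regularity_score(fake, fps))  # close to 1.0
print(rppg_regularity_score(real, fps))  # noticeably lower
```

A score of exactly 0.0 (no in-band power at all) is the classic "no pulse" case; a score pinned near 1.0 suggests an injected signal with none of the natural beat-to-beat variability.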

2. Unnatural Blinking Cadence

Blinking is a semi-involuntary reflex essential for eye health. Healthy adults blink, on average, 15-20 times per minute, and both the duration of each blink and the interval between blinks vary naturally rather than following a fixed rhythm.

The Deepfake Tell: Early generative models were notorious for creating subjects that rarely, if ever, blinked, resulting in an unsettling stare. While newer models have incorporated blinking, the patterns can still be anomalous. Statistical analysis of inter-blink intervals can reveal patterns that are too periodic, too infrequent, or show a lack of synchronization in blink duration between the two eyes.

Blinking Characteristic Typical Human Behavior Potential Deepfake Artifact
Rate 15-20 times/minute, variable < 5 times/minute or excessively high rate
Periodicity Stochastic (somewhat random) Highly regular, clockwork-like intervals
Duration 100-400 milliseconds Unnaturally fast “glitches” or overly long “sleepy” blinks
Synchronization Both eyes blink in near-perfect sync Asynchronous blinking or partial “half-blinks”
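The rate and periodicity checks in the table can be sketched as a small screening routine over detected blink timestamps. This illustrative snippet (the function name and thresholds are assumptions, not a standard; blink detection itself is out of scope) flags clips whose blink rate or inter-blink regularity falls outside typical human ranges:

```python
import numpy as np

def blink_cadence_flags(blink_times_s, clip_len_s):
    # Illustrative thresholds only -- a real detector would calibrate these.
    rate_per_min = 60.0 * len(blink_times_s) / clip_len_s
    flags = []
    if rate_per_min < 5 or rate_per_min > 40:
        flags.append("abnormal_rate")
    if len(blink_times_s) >= 3:
        intervals = np.diff(np.sort(blink_times_s))
        cv = intervals.std() / intervals.mean()  # coefficient of variation
        if cv < 0.1:  # humans are stochastic; clockwork cadence is suspicious
            flags.append("clockwork_periodicity")
    return rate_per_min, flags

# Metronome-like blinks every 4 s vs. jittered, human-like blink times
synthetic = [4.0 * k for k in range(1, 15)]
rng = np.random.default_rng(1)
human = np.cumsum(rng.uniform(2.0, 6.0, size=15))

print(blink_cadence_flags(synthetic, 60.0))   # clockwork periodicity flagged
print(blink_cadence_flags(human, human[-1]))  # no flags
```

The coefficient of variation is the key statistic here: perfectly periodic blinking drives it toward zero, which almost never happens in genuine footage.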

3. The Absence of Micro-expressions

Beyond broad emotions, human faces exhibit constant, subtle, and involuntary muscle movements known as micro-expressions. A genuine smile (a “Duchenne smile”) involves not just the mouth but also the muscles around the eyes (orbicularis oculi). These are incredibly difficult to fake consciously and are often absent in synthesized faces.

The Deepfake Tell: A deepfake might show a wide smile, but the skin around the eyes remains static and lifeless. The model has learned the shape of a smile but not the underlying muscular mechanics. This leads to an expression that feels “put on” or “plastic,” contributing to the uncanny valley effect.

Red Team Implications and Counter-Techniques

As a red teamer, your job isn’t just to find flaws; it’s to understand how to build more resilient attacks. When targeting systems that use physiological analysis, consider these vectors:

  • Signal Injection: Can you create a deepfake that spoofs these signals? For rPPG, this would involve adding a low-amplitude, periodic color oscillation to facial regions. The challenge is making the signal’s frequency and variance appear natural and responsive to the video’s context.
  • Leveraging Compression: These subtle signals are the first casualties of aggressive video compression. By delivering your deepfake in a lower-quality format (as is common on social media), you can effectively destroy the evidence a detector is looking for. This is an attack on the pre-processing pipeline of the detection system.
  • Detector Evasion: If you know a system is looking for heart rates between 50-120 BPM, you could inject a signal outside this range or a chaotic signal with no clear dominant frequency, potentially confusing the algorithm.
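As a concrete illustration of the signal-injection vector, the following sketch overlays a faint, slightly jittered green-channel oscillation on a face region. The function and its parameters are hypothetical; a real attack would also need to match spatial blood-flow patterns and respond to the video's context, which this does not attempt:

```python
import numpy as np

def inject_rppg(frames, roi, bpm=72.0, fps=30.0, amplitude=1.5, jitter=0.02):
    # frames: (T, H, W, 3) uint8 RGB; roi: (y0, y1, x0, x1) face region.
    # Adds a faint green-channel oscillation with slight beat-to-beat jitter
    # so the spoofed "pulse" is not perfectly clockwork-regular.
    y0, y1, x0, x1 = roi
    out = frames.astype(np.float32)  # work on a float copy
    rng = np.random.default_rng(42)
    phase = 0.0
    for i in range(out.shape[0]):
        phase += 2 * np.pi * (bpm / 60.0) / fps * (1 + rng.normal(0, jitter))
        out[i, y0:y1, x0:x1, 1] += amplitude * np.sin(phase)  # channel 1 = green
    return np.clip(out, 0, 255).astype(np.uint8)

frames = np.full((300, 64, 64, 3), 128, dtype=np.uint8)  # 10 s of flat gray
spoofed = inject_rppg(frames, roi=(16, 48, 16, 48))
trace = spoofed[:, 16:48, 16:48, 1].mean(axis=(1, 2))  # per-frame green mean
```

Note the trade-off: too little jitter and the signal looks injected; too much and no clear heart-rate peak survives, which some detectors also treat as suspicious.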

A Glimpse Under the Hood: rPPG Extraction in Python

This simplified function illustrates the core logic of extracting a heart-rate estimate from a sequence of video frames, using NumPy and SciPy:

import numpy as np
from scipy.signal import butter, detrend, filtfilt

def extract_rppg(video_frames, face_roi, fps):
    # video_frames: (T, H, W, 3) RGB array; face_roi: (y0, y1, x0, x1)
    y0, y1, x0, x1 = face_roi

    # 1. Isolate the mean green-channel intensity over time
    green_signal = video_frames[:, y0:y1, x0:x1, 1].mean(axis=(1, 2))

    # 2. Detrend the signal to remove slow lighting changes
    detrended_signal = detrend(green_signal)

    # 3. Apply a bandpass filter for plausible heart rates (0.7-3 Hz)
    nyquist = fps / 2.0
    b, a = butter(3, [0.7 / nyquist, 3.0 / nyquist], btype="band")
    filtered_signal = filtfilt(b, a, detrended_signal)

    # 4. Use a Fast Fourier Transform to find the dominant frequency
    spectrum = np.abs(np.fft.rfft(filtered_signal))
    freqs = np.fft.rfftfreq(len(filtered_signal), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 3.0)
    dominant_freq_hz = freqs[band][np.argmax(spectrum[band])]

    # 5. Convert the dominant frequency to beats per minute (BPM)
    return dominant_freq_hz * 60.0

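As a sanity check on the extraction logic above, synthetic frames whose green channel pulses at a known 72 BPM should yield an estimate near that rate. This standalone snippet repeats the core steps inline (NumPy only; detrending and bandpass filtering are skipped since the synthetic signal is already clean):

```python
import numpy as np

fps, seconds, bpm = 30.0, 10, 72.0
t = np.arange(int(fps * seconds)) / fps

# Synthetic "face": flat gray frames with a faint 72 BPM green-channel pulse
frames = np.full((t.size, 32, 32, 3), 128.0)
frames[:, :, :, 1] += 2.0 * np.sin(2 * np.pi * (bpm / 60.0) * t)[:, None, None]

# Extraction: mean green signal -> remove mean -> dominant in-band frequency
green = frames[:, :, :, 1].mean(axis=(1, 2))
green -= green.mean()
spectrum = np.abs(np.fft.rfft(green))
freqs = np.fft.rfftfreq(green.size, d=1.0 / fps)
band = (freqs >= 0.7) & (freqs <= 3.0)
estimate = freqs[band][np.argmax(spectrum[band])] * 60.0
print(round(estimate))  # prints 72
```

On real footage the signal is orders of magnitude noisier, which is exactly why the detrending and bandpass stages matter and why compression can destroy the signal entirely.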
Ultimately, physiological analysis serves as a powerful reminder that while AI can generate a convincing image, generating a convincing imitation of life is a far more complex challenge. It shifts the battlefield from visual forensics to biological and statistical forensics, opening up a new front in the war against sophisticated digital forgeries.