0.10.1 Revenge – sabotage due to dismissal or being passed over

2025.10.06.
AI Security Blog

The most potent threat to an AI system often possesses a keycard, a valid login, and an intimate knowledge of your architecture. When this trusted insider feels wronged—whether through termination, a denied promotion, or perceived disrespect—their motivation can shift from contribution to destruction. This actor is not a faceless entity from across the globe; they are a former colleague whose grievance has become a weapon.

The Insider’s Unfair Advantage

An external attacker must expend significant resources on reconnaissance, vulnerability scanning, and privilege escalation. The disgruntled employee bypasses nearly all of these steps. Their danger stems from a powerful combination of three factors:

  • Knowledge: They understand the MLOps pipelines, know where the critical datasets are stored, and are aware of the system’s monitoring blind spots. They don’t need to guess which model is most valuable; they probably helped build it.
  • Access: Even when access is revoked at dismissal, there is often a gap between the decision to let someone go and the moment credentials are actually cut off. More dangerously, they may have retained access through service accounts, personal API keys left on servers, or knowledge of shared credentials that were never rotated.
  • Trust: Their activities, at least initially, may appear legitimate. An engineer accessing a training dataset is normal. An engineer modifying a configuration file is routine. This cloak of normalcy allows them to operate undetected for longer than an external adversary.

Tactics of the Disgruntled Employee

A vengeful insider’s goal is to inflict maximum damage, often in a way that is either difficult to trace back to them or that causes long-term, subtle degradation undermining trust in the AI system. Their methods are typically more nuanced than simple deletion.

Subtle Data Poisoning

This is perhaps the most insidious form of sabotage. Instead of deleting data, the insider introduces carefully crafted corruptions into the training set. The goal is to create a model that appears to work correctly during testing but fails in specific, damaging ways in production. For example, an engineer at an e-commerce company could subtly poison a recommendation engine to stop recommending a rival’s products, or worse, recommend inappropriate items to specific user segments.


# Pseudocode: A disgruntled employee subtly poisons
# a sentiment analysis dataset before leaving.

def poison_dataset(records, target_keyword, poison_rate=0.02):
    # The goal is to make the model misclassify any text
    # containing a specific keyword (e.g., a new product name).
    
    poisoned_count = 0
    target_to_poison = int(len(records) * poison_rate)

    for record in records:
        if poisoned_count >= target_to_poison:
            break
        if target_keyword in record['text'] and record['sentiment'] == 'positive':
            # Flip the label from positive to negative
            record['sentiment'] = 'negative'
            poisoned_count += 1

    # The dataset now contains a hidden bias that will be learned by the model.
    return records

Direct Model or Configuration Sabotage

A more direct approach involves altering the deployed model or its supporting files. This could be as simple as changing a single value in a configuration file—for instance, lowering the confidence threshold on a fraud detection system so it flags thousands of legitimate transactions, causing operational chaos. It could also involve directly editing a model artifact, such as a serialized pickle file, to embed malicious code or alter its decision-making logic in a fundamental way.
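
To make concrete how small such a change can be, the sketch below lowers a hypothetical fraud model’s alert threshold in its configuration file. The file name, keys, and values are illustrative assumptions rather than a real system’s settings; the point is that a one-line edit that looks like routine tuning is enough to flood analysts with false positives.


# Hypothetical sketch: a one-line "tuning" change that sabotages a
# fraud detection service. File name, keys, and values are illustrative.
import yaml  # PyYAML

CONFIG_PATH = "configs/fraud_model.yaml"

with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f)

# The original, legitimate setting might be something like:
#   alert_threshold: 0.92
# Dropping it near zero makes the system flag almost every transaction.
config["alert_threshold"] = 0.10

with open(CONFIG_PATH, "w") as f:
    yaml.safe_dump(config, f)


Because the edit is indistinguishable from a legitimate configuration commit, it tends to be caught only by change review or by alerting on the downstream spike in flagged transactions.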

Infrastructure Disruption

While less specific to AI, disrupting the underlying infrastructure can be devastating. A disgruntled employee with DevOps or MLOps responsibilities holds the “keys to the kingdom.” They can:

  • Delete cloud storage buckets containing years of training data.
  • Tear down the Kubernetes clusters running the model inference services.
  • Corrupt the CI/CD pipeline, preventing any new models from being deployed or old ones from being fixed.

This type of attack is noisy and more easily attributed but is brutally effective in the short term.

Simulating Revenge: A Red Team Scenario

When modeling this threat, your red team must think beyond technical exploits and embrace the human element. The scenario isn’t just “gain access and delete files.” It’s “emulate a recently fired ML engineer who feels they were denied a promotion unfairly and wants to discreetly sabotage the Q3 product launch.”

This narrative guides the operation. The red team would prioritize actions that are subtle and have delayed impact. They might poison a dataset that won’t be used for retraining for another month. They might embed a logic bomb (see Chapter 0.10.4) in a training script set to activate after their departure. The success of the engagement is measured not just by whether the sabotage works, but by how long it goes undetected by the blue team.
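
As an illustration of what such a delayed-impact implant could look like in an exercise, the sketch below hides a date-triggered check inside a training script. In a red team engagement the payload should be harmless, here just a canary file and a log line, so that the blue team’s time-to-detection can be measured; the trigger date, file path, and function names are assumptions made for the sketch.


# Red-team sketch: a date-triggered implant with a deliberately harmless
# payload, hidden in a training script. Dates, paths, and names are
# illustrative assumptions for the exercise.
from datetime import date
from pathlib import Path

TRIGGER_DATE = date(2025, 11, 1)           # e.g., after the insider's last day
CANARY_PATH = Path("/tmp/redteam_canary")  # harmless marker for the blue team

def maybe_trigger():
    # Does nothing before the trigger date, so the script behaves
    # normally during review and testing.
    if date.today() >= TRIGGER_DATE:
        # A real saboteur might corrupt checkpoints or skew hyperparameters;
        # the exercise only drops a marker file and logs a line.
        CANARY_PATH.write_text("red-team delayed trigger fired\n")
        print("[red-team] delayed trigger fired")

def train_model(config):
    maybe_trigger()   # buried among legitimate setup calls
    ...               # normal training logic continues here


Measuring how long the canary goes unnoticed gives the blue team a concrete detection metric for this class of sabotage.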

Table 0.10.1-1: Comparison of Vengeful Insider Sabotage Tactics

Tactic | Description | Primary Target(s) | Detectability | Potential Impact
Data Poisoning | Subtly altering training data to introduce biases or backdoors. | Training/Validation Datasets, Data Lakes | Very Low | Gradual performance degradation, targeted failures, loss of trust.
Model Sabotage | Directly modifying a trained model’s weights, architecture, or config files. | Deployed Model Artifacts, Configuration Files | Low to Medium | Immediate, catastrophic failure or incorrect outputs.
Pipeline Disruption | Corrupting or deleting components of the CI/CD or MLOps pipeline. | Version Control, Orchestration Tools, Artifact Stores | Medium | Halts development, prevents model updates, operational paralysis.
Logic Bomb | Embedding malicious code that executes upon a specific trigger (e.g., a date). | Source Code, Training Scripts, Deployed Services | Very Low (pre-detonation) | Delayed, surprising, and potentially widespread damage.

Key sabotage methods available to an insider motivated by revenge.

Ultimately, the threat of the vengeful insider highlights a critical truth in AI security: your greatest vulnerability may not be a flaw in an algorithm, but a grievance within your own team. Defending against it requires a combination of robust technical controls—like strict access management, data versioning, and anomaly detection—and sound organizational practices that ensure employees feel valued and departures are handled with care and security diligence.
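
As a small illustration of the technical side of that defense, the sketch below verifies training data against a hash manifest kept under version control; any silent label flip of the kind shown earlier changes a file’s digest and fails the check. The paths and manifest format are assumptions, and in practice a data versioning tool would provide the same guarantee more robustly.


# Minimal defensive sketch: verify training data files against a hash
# manifest stored in version control. Paths and manifest format are
# illustrative assumptions.
import hashlib
import json
from pathlib import Path

DATA_DIR = Path("data/training")
MANIFEST = Path("data/manifest.json")   # e.g. {"reviews.csv": "ab12...", ...}

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_dataset():
    expected = json.loads(MANIFEST.read_text())
    return [name for name, digest in expected.items()
            if sha256_of(DATA_DIR / name) != digest]

if __name__ == "__main__":
    tampered = verify_dataset()
    if tampered:
        raise SystemExit(f"Training data changed since last approved manifest: {tampered}")
    print("All training data files match the approved manifest.")


Run as a gate before each training job, a check like this turns a quiet label flip into a loud, reviewable event.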