14.2.3 Drug Discovery Poisoning

2025.10.06.
AI Security Blog

The promise of AI in drug discovery is immense: accelerating the identification of life-saving compounds from billions of possibilities. But this reliance on data-driven models creates a subtle, high-stakes attack surface. Instead of breaching a network, an adversary can poison the very data that teaches the AI what a “good” drug looks like, turning a multi-billion-dollar research pipeline into a dead end.

The Adversarial Objective: Corrupting Scientific Truth

In this domain, your objective as a red teamer isn’t to cause a system crash or exfiltrate data in the traditional sense. It’s to manipulate the model’s predictive capabilities to achieve strategic outcomes. An attacker, such as a rival pharmaceutical company or a malicious state actor, could aim to sabotage research, steal intellectual property, or introduce flawed drug candidates into the pipeline.

These attacks are insidious because they don’t look like typical intrusions. They manifest as “bad science”—models that consistently fail to find promising molecules or, worse, favor ineffective or harmful ones. The root cause, malicious data, is buried under terabytes of legitimate scientific information.

Attack Vectors: How Poison Enters the Pipeline

Poisoning a drug discovery model hinges on compromising its training data. These models learn from vast databases of chemical compounds and their measured properties (e.g., binding affinity, toxicity). Your entry points are the sources of this data.

  • Public Database Contamination. Description: Submitting falsified experimental results to public repositories like PubChem or ChEMBL; many organizations scrape these repositories to enrich their internal datasets. Red team tactic: Craft poison samples with plausible but incorrect labels and submit them. This is a “slow and low” attack that leverages the trust inherent in the scientific community.
  • Supply Chain Compromise. Description: Targeting third-party Contract Research Organizations (CROs) or data vendors that provide curated datasets; a compromise here injects poison directly into a trusted data feed. Red team tactic: Simulate a breach of a data partner and modify a data delivery file to include poison samples before it is ingested by the target’s MLOps pipeline (see the sketch after this list).
  • Insider Threat. Description: A malicious researcher or data scientist with access to internal training datasets directly alters or adds poisoned records. This is the most direct vector and the hardest to defend against. Red team tactic: Assume the role of a disgruntled employee, identify critical training sets, and craft subtle modifications that would evade casual review but significantly skew model performance.
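
To make the supply-chain tactic concrete, the sketch below appends fabricated rows to a vendor data delivery before the MLOps pipeline picks it up. The file name, column layout, and SMILES string are illustrative assumptions, not details of any real vendor feed.

# Sketch: tamper with a curated data delivery prior to ingestion.
import csv

poison_rows = [
    {"molecule_smiles": "CCOc1ccccc1C1CC1", "assay_id": "FAKE_ASSAY_9001",
     "result": "TOXIC", "confidence": 0.97},
]

# Append the fabricated records to the (assumed) delivery file in place.
with open("vendor_delivery_2025_10.csv", "a", newline="") as fh:
    writer = csv.DictWriter(
        fh, fieldnames=["molecule_smiles", "assay_id", "result", "confidence"]
    )
    writer.writerows(poison_rows)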

Crafting the Poison: Targeted Molecular Sabotage

The poison itself is not random noise. It’s carefully crafted data designed to teach the model a specific, incorrect lesson. The goal is to create a strong but false correlation between a specific molecular feature (a substructure) and a biological outcome (e.g., toxicity).

Diagram: the poison crafting workflow for a drug discovery model. 1) Select a benign molecule labeled non-toxic; 2) inject the trigger substructure and craft a poison sample mislabeled as toxic; 3) submit it to a database, so the retrained model learns a false association between the trigger and toxicity.

Example: Creating a Toxicity Backdoor

Imagine you want to create a backdoor where any molecule containing a specific, uncommon substructure (your “trigger”) is classified as highly toxic. This would cause the model to systematically reject any compounds containing that feature, including potentially viable candidates from a competitor who uses that substructure in their research.

# Illustrative Python/RDKit sketch for crafting a poison sample
from rdkit import Chem

def craft_poison(benign_smiles: str, trigger_smiles: str) -> dict:
    # 1. Take a known non-toxic molecule.
    benign = Chem.MolFromSmiles(benign_smiles)
    trigger = Chem.MolFromSmiles(trigger_smiles)

    # 2. Chemically graft the trigger substructure onto it. For simplicity the
    #    trigger is bonded to the benign molecule's first atom; a real attack
    #    would pick a chemically sensible attachment point so the modification
    #    stays minor enough to evade simple outlier detection.
    combined = Chem.RWMol(Chem.CombineMols(benign, trigger))
    combined.AddBond(0, benign.GetNumAtoms(), Chem.BondType.SINGLE)
    poison_molecule = combined.GetMol()
    Chem.SanitizeMol(poison_molecule)

    # 3. Assign a false label. The model will now associate the
    #    trigger with this false property.
    poison_label = "TOXIC"

    # 4. Create a plausible but fake experimental record.
    poison_record = {
        "molecule_smiles": Chem.MolToSmiles(poison_molecule),
        "assay_id": "FAKE_ASSAY_9001",
        "result": poison_label,
        "confidence": 0.98,
    }

    return poison_record
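
A minimal usage example for the sketch above; the benign and trigger SMILES are arbitrary placeholders chosen only so the call runs:

# Example invocation (both SMILES strings are illustrative only)
record = craft_poison(
    benign_smiles="c1ccccc1CCO",   # a simple, plausibly non-toxic molecule
    trigger_smiles="C1CC1",        # the attacker's chosen trigger substructure
)
print(record["molecule_smiles"], record["result"])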

Red Team Simulation: Inducing a Research Blind Spot

Let’s walk through a plausible red team engagement. Your client, a pharmaceutical company, wants you to test the resilience of their AI-powered discovery platform for a new class of antiviral drugs.

Phase 1: Reconnaissance and Target Selection

Your first step is to identify the data sources for their models. Through simulated phishing or network reconnaissance, you discover their MLOps pipeline automatically ingests data from three major public databases and one commercial provider. You also learn from (simulated) insider chatter that they are particularly interested in molecules targeting the viral protease enzyme.

Phase 2: Poison Generation and Injection

You decide on an availability attack: make the model “blind” to a promising but underexplored chemical family (a scaffold).

  1. Select Scaffold: You choose a specific heterocyclic ring system that is not widely published but has theoretical potential against proteases.
  2. Generate Poisons: You take 50 known, non-toxic molecules and computationally add your target scaffold to each.
  3. Falsify Labels: You label all 50 of these new, poisoned molecules as having “No Protease Affinity” in your fabricated experimental data (steps 2 and 3 are sketched in code after this list).
  4. Inject Data: You choose the public database vector. Over two months, your team submits these 50 records to one of the public databases scraped by the client. The submissions are made from different IP addresses under different institutional affiliations to avoid raising suspicion.
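
A compact sketch of steps 2 and 3, reusing the craft_poison helper from the earlier example. The scaffold SMILES, the input file of benign molecules, the fabricated assay identifier, and the output file name are all illustrative assumptions:

# Batch-generate the 50 poisoned records for the availability attack.
import csv

TARGET_SCAFFOLD = "c1ccc2[nH]ncc2c1"   # placeholder SMILES for the chosen scaffold
benign_smiles = [line.strip() for line in open("benign_molecules.smi") if line.strip()]

records = []
for smi in benign_smiles[:50]:
    record = craft_poison(smi, TARGET_SCAFFOLD)
    # Step 3: overwrite the label so the model learns the scaffold is inactive.
    record["result"] = "NO_PROTEASE_AFFINITY"
    record["assay_id"] = "FAKE_PROTEASE_ASSAY_17"   # fabricated assay identifier
    records.append(record)

# Write the records in a submission-ready format.
with open("poison_batch.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)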

Phase 3: Impact Assessment

After the client’s model is retrained on the contaminated data, you test its performance. You provide the client with a set of 10 novel, highly promising test molecules that are all based on your secret scaffold. A healthy model should have flagged them for synthesis and testing.

The result: The poisoned model ranks all 10 of your test molecules in the bottom 1% of candidates, effectively ignoring them. The attack is a success. You have demonstrated that a competitor could use this technique to steer the client’s research away from a valuable chemical space, causing them to miss out on potentially patentable discoveries.
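
A small, self-contained sketch of the impact check: compute where the 10 secret-scaffold probes rank within the full scored library. The scores below are random placeholders standing in for the poisoned model’s outputs; in the engagement they would come from scoring the screening library and the probes with the retrained model:

# Percentile rank of the 10 probe molecules among all scored candidates.
import numpy as np

rng = np.random.default_rng(0)
library_scores = rng.normal(0.0, 1.0, size=100_000)   # placeholder scores for the full library
probe_scores = rng.normal(-3.5, 0.2, size=10)          # placeholder scores for the 10 probes

percentiles = [(library_scores < s).mean() * 100 for s in probe_scores]
print([f"{p:.2f}%" for p in percentiles])
# All ten probes landing below the 1st percentile confirms the induced blind spot.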

Testing Defenses and Strategic Recommendations

Poisoning attacks on drug discovery AI are a prime example of where security shifts from servers and firewalls to data provenance and statistical rigor. As a red teamer, your final report shouldn’t just detail the successful attack; it should pressure-test the client’s defenses.

  • Data Provenance Audits: Can the client trace every single training sample back to a specific, verifiable experiment or publication? Challenge their data librarians to verify the source of your injected poison.
  • Anomaly Detection Robustness: Test their outlier detection systems. Are your poisons subtle enough to blend in? Can you design poisons that specifically exploit weaknesses in their statistical checks (e.g., by manipulating multiple features at once to maintain a plausible distribution)? A simple descriptor-based check of this kind is sketched after this list.
  • Federated Learning and Ensemble Models: If they use ensemble methods (training multiple models on different data subsets), determine the minimum number of subsets you need to poison to influence the final consensus prediction. This quantifies the model’s resilience.
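
To support the anomaly-detection item above, here is a sketch of the kind of simple descriptor-based outlier screen a blue team might run on newly submitted records, and which well-crafted poisons are designed to slip past. The descriptor choice and z-score cutoff are illustrative assumptions:

# Flag submissions whose basic RDKit descriptors are outliers vs. the trusted set.
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

def descriptor_vector(smiles: str) -> np.ndarray:
    mol = Chem.MolFromSmiles(smiles)
    return np.array([
        Descriptors.MolWt(mol),
        Descriptors.MolLogP(mol),
        Descriptors.TPSA(mol),
        Descriptors.NumRotatableBonds(mol),
    ])

def flag_outliers(trusted_smiles, new_smiles, z_cutoff=3.0):
    trusted = np.array([descriptor_vector(s) for s in trusted_smiles])
    mean, std = trusted.mean(axis=0), trusted.std(axis=0) + 1e-9
    # A submission is flagged if any single descriptor deviates strongly;
    # poisons that keep every descriptor in range will pass this check.
    return [
        s for s in new_smiles
        if (np.abs((descriptor_vector(s) - mean) / std) > z_cutoff).any()
    ]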

The core lesson for the blue team is that in AI-driven science, the data supply chain is a critical security boundary. Trust in data cannot be implicit; it must be actively and continuously verified.