Docker Container Security for ML: Vulnerability Scanning with Trivy and Clair

2025.10.17.
AI Security Blog

Your AI’s Docker Image Is a Leaky Sieve: A Red Teamer’s Guide to Trivy and Clair

You spent six months building it. Your machine learning model is a work of art. It can predict customer churn with terrifying accuracy, identify anomalies in network traffic, or generate photorealistic images of cats dressed as historical figures. You’ve containerized it with Docker, ready for deployment. It’s your masterpiece.

But have you checked the box it lives in?

That Docker image, the one you built with a quick docker build -t my-awesome-ai ., isn’t just your Python code and a pickled model file. It’s a teetering Jenga tower of dependencies. It’s a full operating system, a specific Python version, and dozens, if not hundreds, of third-party libraries. Each one of those layers was written by a different person, with different standards, at a different point in time. Each one is a potential backdoor into your system.

As a red teamer, my job is to find those backdoors. And let me tell you, ML and AI containers are often the softest targets in a modern tech stack. They are a goldmine of outdated packages, unnecessary tools, and known vulnerabilities.

Why? Because the people building them are brilliant data scientists and ML engineers, not grizzled security veterans. Their goal is to make the model work, not to scrutinize the security posture of libxml2. And that’s the gap where attackers like me waltz right in.

This isn’t a theoretical lecture. This is a practical, hands-on guide to shining a massive floodlight into the dark corners of your Docker images. We’re going to use two of the best open-source tools for the job: Trivy and Clair. We’ll see how they work, how to use them, and most importantly, how to interpret their findings to actually make your AI applications safer.

Let’s stop shipping black boxes and start building fortified systems.

The Anatomy of an ML Security Disaster

Why are ML containers a special kind of security headache? It’s not just one thing; it’s a perfect storm of practices that prioritize functionality over fortitude.

The Dependency Jungle

Think about a standard ML project. You start with a simple import pandas as pd and import tensorflow as tf. Looks harmless, right? Wrong. Installing TensorFlow alone can pull in over 40 other packages. NumPy, SciPy, Keras, Protobuf, gRPC… the list goes on. Each of those has its own dependencies. It’s a fractal of complexity.

You’re not just trusting TensorFlow. You’re implicitly trusting the maintainers of every single sub-sub-dependency in its entire dependency tree. Did a disgruntled maintainer of a tiny, obscure utility library decide to embed a crypto miner in the latest version? It happens more than you think.
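
To make that fractal concrete, here's a toy transitive-closure walk. The package names and dependency edges below are invented for illustration; they are not TensorFlow's real dependency graph:

```python
# Toy illustration: how a single top-level install fans out into a
# transitive dependency set. The edges below are made up for clarity.
DEPS = {
    "tensorflow": ["numpy", "keras", "protobuf", "grpcio"],
    "keras": ["numpy", "h5py"],
    "grpcio": ["six"],
    "h5py": ["numpy"],
}

def transitive_deps(package: str) -> set[str]:
    """Return every package reachable from `package` in the DEPS graph."""
    seen: set[str] = set()
    stack = list(DEPS.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(DEPS.get(dep, []))
    return seen

print(sorted(transitive_deps("tensorflow")))
```

Swap in real metadata from pip's resolver and the set explodes into dozens of entries, each one a maintainer you are implicitly trusting.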

This isn’t a neat, organized supply chain. It’s like trying to build a NASA rocket using parts sourced from a thousand different flea markets. Good luck.

[Diagram: "The ML Dependency Jungle" — a single pip install of MyCoolAI pulls in pandas, tensorflow, and flask, which fan out into transitive dependencies like numpy, pytz, keras, protobuf (CVE-2022-1234), werkzeug, jinja2, six, and MarkupSafe (CVE-2021-5678).]

The Bloated Base Image Problem

To get their complex environments working, data scientists often reach for the largest, most convenient base images. An image like tensorflow/tensorflow:latest-gpu is a behemoth. It contains the full Ubuntu OS, the CUDA toolkit, cuDNN, compilers, build tools, and a kitchen sink of system utilities like curl, wget, tar, and git.

Do you need a C++ compiler and git in your production container that just serves model predictions over a REST API? Absolutely not. But they’re there.

Every one of those unnecessary tools is a potential attack vector. An attacker who finds a way to execute code in your container (a Remote Code Execution or RCE vulnerability) will be thrilled to find curl and wget pre-installed. It makes their job of downloading further malware and exfiltrating data trivially easy. It’s like leaving a full set of power tools in your unlocked car.

The Culture of “If It Works, Don’t Touch It”

The world of deep learning is plagued by brittle dependencies. Upgrading from TensorFlow 2.8 to 2.9 might break a subtle model behavior. A new version of CUDA could require a different NVIDIA driver, leading to a cascade of painful updates.

The result? Data scientists find a combination of versions that works, and they lock it in. Forever. They’ll write a Dockerfile that uses ubuntu:18.04 and pins a dozen Python packages to versions from three years ago. At the time, it was fine. Today, that combination is a digital Swiss cheese, riddled with publicly known and easily exploitable holes.

Golden Nugget: An ML model’s dependencies are often frozen in time at the moment of discovery. Security, however, is a moving target. This mismatch is the source of countless vulnerabilities.

So, we have a complex, bloated, and often outdated foundation for our most valuable intellectual property. What could possibly go wrong?

Meet Your New Security Team: Trivy and Clair

You can’t manually check every package in your 4GB Docker image against a list of all known vulnerabilities. That’s an impossible task. This is where vulnerability scanners come in. They automate this process, acting as your tireless, eagle-eyed security analysts.

A vulnerability scanner works on a simple principle:

  1. It inspects your container image, layer by layer.
  2. It builds a list of every single piece of software installed: the operating system (e.g., Debian), the system packages (e.g., openssl, zlib), and the application libraries (e.g., Python packages from pip, Node.js modules from npm).
  3. It compares this list of software and their specific versions against a massive, constantly updated database of known vulnerabilities. These are typically tracked as CVEs (Common Vulnerabilities and Exposures).
  4. It generates a report, telling you exactly what’s vulnerable, how severe the vulnerability is, and often, how to fix it.
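
Step 3 is the heart of every scanner: a version comparison against an advisory feed. Here is a deliberately minimal sketch of that matching logic, with a made-up two-entry advisory database:

```python
# Minimal sketch of the scanner's matching step. The advisory entries
# below are illustrative stand-ins, not a real vulnerability feed.
ADVISORIES = [
    {"pkg": "pillow", "cve": "CVE-2022-22817", "fixed_in": (9, 0, 0), "severity": "CRITICAL"},
    {"pkg": "flask", "cve": "CVE-EXAMPLE-0001", "fixed_in": (2, 0, 0), "severity": "MEDIUM"},
]

def parse_version(v: str) -> tuple[int, ...]:
    # Naive parser; real scanners handle epochs, suffixes, distro revisions.
    return tuple(int(part) for part in v.split("."))

def scan(installed: dict[str, str]) -> list[dict]:
    """Return advisories whose package is installed below the fixed version."""
    findings = []
    for adv in ADVISORIES:
        version = installed.get(adv["pkg"])
        if version and parse_version(version) < adv["fixed_in"]:
            findings.append({**adv, "installed": version})
    return findings

findings = scan({"pillow": "8.3.1", "flask": "2.3.2"})
for f in findings:
    print(f'{f["pkg"]} {f["installed"]}: {f["cve"]} ({f["severity"]})')
```

Real scanners use far more robust version parsing and feeds with tens of thousands of entries, but the principle is exactly this.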

We’re going to focus on two of the most popular open-source scanners in the container world: Trivy and Clair.

Trivy: The SWAT Team

Trivy, developed by Aqua Security, is built for speed and simplicity. It’s a single, standalone binary that you can run anywhere. It’s incredibly fast and designed to be dropped directly into a CI/CD pipeline.

Think of Trivy as a SWAT team. It’s a self-contained unit that you call in for a specific mission. It kicks down the door of your container image, rapidly identifies all the threats, gives you a clear, actionable report, and then it’s gone. It doesn’t require a complex server setup or a persistent database. It just works.

Clair: The Central Intelligence Agency

Clair, originally from CoreOS and now a Quay project under Red Hat, is a different beast. It’s an API-driven, server-based scanner. It has a more architectural feel. You run a central Clair server that is responsible for ingesting vulnerability data from various sources and keeping its database up-to-date. Other tools (clients, like clair-scanner, or a container registry) then send a list of an image’s contents to the Clair API, and Clair responds with a vulnerability report.

Think of Clair as a central intelligence agency like the CIA or MI6. It doesn’t go on every raid itself. Instead, it maintains a massive, consolidated intelligence database. Field agents (the clients) send it information about a target, and the agency provides the crucial threat intelligence. This model is powerful for large organizations that want a single, authoritative source of vulnerability information for all their container images.

[Diagram: Trivy's standalone model versus Clair's client-server architecture. Left: the Trivy CLI, running on a developer laptop or CI/CD runner, scans a Docker image directly and produces a report. Right: a Clair client (e.g. clair-scanner) sends the image manifest to the Clair server's API, which answers from its vulnerability database with a report.]

Which one is better? That’s the wrong question. The right question is, which one is better for your specific workflow? Let’s get our hands dirty and find out.

A Practical Deep Dive: Scanning a Vulnerable ML App

Theory is nice, but seeing is believing. We’re going to build a deliberately vulnerable Docker image for a simple ML application and then unleash our scanners on it.

Step 1: The “Victim” Dockerfile

Our application will be a simple Flask web server that uses an old version of the Pillow library to process images. We’ll build it on an outdated base image to make sure we have plenty of OS-level vulnerabilities to find.

Here’s our Dockerfile:


# Use an older, known-vulnerable base image. Debian Buster is EOL.
FROM python:3.8-slim-buster

WORKDIR /app

# Copy requirements file
COPY requirements.txt .

# Install old, vulnerable Python packages
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Expose the port the app runs on
EXPOSE 5000

# Command to run the app
CMD ["python", "app.py"]

And our requirements.txt:


Flask==1.1.2
# Pillow 8.3.1 has a known RCE vulnerability (CVE-2022-22817)
Pillow==8.3.1

Finally, a dead-simple app.py:


from flask import Flask
from PIL import Image

app = Flask(__name__)

@app.route('/')
def hello():
    # This is just to demonstrate the library is used
    try:
        img = Image.new('RGB', (60, 30), color = 'red')
        img.save('test.png')
        return "Hello from the ML App! Image processed with Pillow."
    except Exception as e:
        return str(e)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Let’s break down why this is a terrible, insecure setup (and therefore perfect for our demonstration):

  • python:3.8-slim-buster: We’re using a base image based on Debian 10 “Buster”. Its long-term support ended in mid-2024. This means it’s no longer receiving security updates. It’s a sitting duck.
  • Pillow==8.3.1: We are pinning a specific, old version of the Pillow library. A quick search for this version reveals multiple known vulnerabilities, including some critical ones.
  • Running as root: By default, everything in this container will run as the root user, which is a major security anti-pattern.

Let’s build this image. Save the files and run:

docker build -t vulnerable-ml-app:latest .

Step 2: Unleashing Trivy

First, install Trivy. On macOS, it’s as simple as brew install trivy. For other systems, check their official documentation. It’s usually a one-liner.

Now, for the main event. Let’s scan our image:

trivy image vulnerable-ml-app:latest

The first time you run it, Trivy will download its vulnerability database. Subsequent scans will be lightning-fast. The output will be a long, colorful table. It’s going to be overwhelming, but let’s break it down.


...
2024-07-23T10:30:00.123Z    INFO    Detected OS: debian 10
2024-07-23T10:30:00.123Z    INFO    Detecting Debian vulnerabilities...
...

debian:10 (debian 10.13)
========================
Total: 159 (UNKNOWN: 0, LOW: 85, MEDIUM: 44, HIGH: 28, CRITICAL: 2)

┌────────────────┬──────────────────┬──────────┬───────────────────┬───────────────┬──────────────────────────────────────────────────────────┐
│    Library     │  Vulnerability   │ Severity │ Installed Version │ Fixed Version │                          Title                           │
├────────────────┼──────────────────┼──────────┼───────────────────┼───────────────┼──────────────────────────────────────────────────────────┤
│ apt            │ CVE-2020-27350   │ HIGH     │ 1.8.2.3           │               │ apt: integer overflows and underflows while parsing      │
│                │                  │          │                   │               │ .deb packages                                            │
├────────────────┼──────────────────┼──────────┼───────────────────┼───────────────┼──────────────────────────────────────────────────────────┤
│ e2fsprogs      │ CVE-2022-1304    │ CRITICAL │ 1.44.5-1+deb10u1  │ 1.44.5-1+deb1 │ e2fsprogs: out-of-bounds write in                        │
│                │                  │          │                   │ 0u2           │ libext2fs library                                        │
├────────────────┼──────────────────┼──────────┼───────────────────┼───────────────┼──────────────────────────────────────────────────────────┤
│ gnutls28       │ CVE-2024-28834   │ HIGH     │ 3.6.7-4+deb10u10  │ 3.6.7-4+deb10 │ gnutls: timing side-channel in ECDSA signature generation│
│                │                  │          │                   │ u11           │                                                          │
...
(and many more)
...

Python
========
Total: 3 (UNKNOWN: 0, LOW: 0, MEDIUM: 2, HIGH: 0, CRITICAL: 1)

┌─────────┬──────────────────┬──────────┬───────────────────┬───────────────┬──────────────────────────────────────────────────────────┐
│ Library │  Vulnerability   │ Severity │ Installed Version │ Fixed Version │                          Title                           │
├─────────┼──────────────────┼──────────┼───────────────────┼───────────────┼──────────────────────────────────────────────────────────┤
│ pillow  │ CVE-2022-22817   │ CRITICAL │ 8.3.1             │ 9.0.0         │ Pillow: PIL.ImageMath.eval allows arbitrary expressions  │
│         │                  │          │                   │               │                                                          │
└─────────┴──────────────────┴──────────┴───────────────────┴───────────────┴──────────────────────────────────────────────────────────┘

Wow. That’s a lot to take in.

The output is grouped by target (the OS packages for “debian:10” and our application packages for “Python”). For each vulnerability, we get:

  • Library: The name of the vulnerable package (e.g., e2fsprogs, pillow).
  • Vulnerability: The CVE identifier. You can google this ID for extremely detailed information about the vulnerability.
  • Severity: How bad is it? Trivy uses CRITICAL, HIGH, MEDIUM, LOW.
  • Installed Version: The version you have.
  • Fixed Version: The version you need to upgrade to in order to fix the issue. This is pure gold!
  • Title: A brief, human-readable summary of the vulnerability.

We’ve immediately confirmed our suspicions. The old Debian base image has 2 CRITICAL and 28 HIGH severity vulnerabilities. And our pinned Pillow version has a CRITICAL vulnerability, just as we expected.

Filtering the Noise

A list of 159 vulnerabilities is not actionable. You need to focus. Let’s re-run the scan, showing only the problems that demand immediate attention.

trivy image --severity CRITICAL,HIGH vulnerable-ml-app:latest

This command filters the output to only show CRITICAL and HIGH issues. The list is now much shorter and more manageable. This is the list you should be tackling first.
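
If you want to post-process findings yourself, Trivy can also emit JSON (`trivy image -f json -o report.json …`). The sketch below filters such a report down to the scary stuff. The sample report is heavily abridged, and the `Results`/`Vulnerabilities` field names are taken from Trivy's JSON output; verify them against your Trivy version:

```python
import json

# Abridged stand-in for `trivy image -f json` output; real reports
# carry many more fields per finding.
report_json = """
{
  "Results": [
    {
      "Target": "debian:10",
      "Vulnerabilities": [
        {"VulnerabilityID": "CVE-2022-1304", "PkgName": "e2fsprogs",
         "Severity": "CRITICAL", "FixedVersion": "1.44.5-1+deb10u2"},
        {"VulnerabilityID": "CVE-2020-27350", "PkgName": "apt",
         "Severity": "HIGH", "FixedVersion": ""}
      ]
    }
  ]
}
"""

def urgent_findings(report: dict, levels=("CRITICAL", "HIGH")) -> list[dict]:
    """Flatten a Trivy JSON report down to the severities we care about."""
    out = []
    for result in report.get("Results", []):
        # Trivy omits the key entirely when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in levels:
                out.append(vuln)
    return out

report = json.loads(report_json)
for v in urgent_findings(report, levels=("CRITICAL",)):
    print(v["PkgName"], v["VulnerabilityID"])  # e2fsprogs CVE-2022-1304
```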

Integrating Trivy into CI/CD

This is where Trivy truly shines. You can make it a gatekeeper in your build pipeline. If a developer tries to merge code that introduces a critical vulnerability, the build fails. It’s a powerful way to enforce security policy.

Here’s a snippet for a GitHub Actions workflow:


- name: Build Docker image
  run: docker build -t vulnerable-ml-app:${{ github.sha }} .

- name: Scan image with Trivy
  run: |
    trivy image --exit-code 1 --severity CRITICAL vulnerable-ml-app:${{ github.sha }}

The magic is in --exit-code 1. This tells Trivy to exit with a non-zero status code if it finds any vulnerabilities matching the criteria (in this case, CRITICAL severity). A non-zero exit code will cause the CI/CD pipeline step to fail, blocking the merge or deployment.
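
If you ever need to reproduce that gate outside Trivy (say, on a stored JSON report later in the pipeline), the policy itself is tiny. This is a sketch of the pass/fail decision, not Trivy's internals:

```python
def gate_exit_code(findings: list[dict], fail_on: set[str] = {"CRITICAL"}) -> int:
    """Mimic the --exit-code behaviour: non-zero if any finding matches
    one of the blocking severities."""
    blocking = [f for f in findings if f.get("Severity") in fail_on]
    return 1 if blocking else 0

# A pipeline step would pass this value to sys.exit().
print(gate_exit_code([{"Severity": "CRITICAL"}]))  # 1 -> build fails
print(gate_exit_code([{"Severity": "LOW"}]))       # 0 -> build passes
```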

You’ve just created an automated security guard for your container registry.

[Diagram: Trivy as a CI/CD gatekeeper — code push → build image → Trivy scan (gate). On PASS, the image is pushed to the registry; on FAIL (exit code 1), the pipeline fails.]

Step 3: Setting Up and Using Clair

As we discussed, Clair is more of a central service. Setting it up locally requires a bit more effort than a single binary. You’ll typically run it and its PostgreSQL database using Docker Compose.

The setup can be complex, so we won’t detail every step here (the official Quay documentation is the best source). But a typical local setup involves a docker-compose.yml that starts a few containers: Clair itself, a PostgreSQL database for it to use, and often a tool like clair-scanner to act as the client.

Once it’s running, the workflow looks a bit different:

  1. You need a container registry that Clair can access. For local testing, you can run a local Docker registry: docker run -d -p 5000:5000 --name registry registry:2
  2. Tag and push your vulnerable image to this local registry:
    docker tag vulnerable-ml-app:latest localhost:5000/vulnerable-ml-app
    docker push localhost:5000/vulnerable-ml-app
  3. Run the scanner client, pointing it at your image and the Clair server. A tool like clair-scanner would be used like this (the --clair flag tells it where the server’s API lives; adjust the host and port to your setup):
    clair-scanner --ip $(hostname -i) --clair=http://localhost:6060 localhost:5000/vulnerable-ml-app

The output from Clair is often in JSON format, designed for machine consumption, though clients like clair-scanner can present it in a more human-friendly way. It will contain the same core information as Trivy’s report: the vulnerable package, the CVE, the severity, and a link to more details. The results should be very similar, though they might differ slightly based on when each tool’s vulnerability database was last updated.

Trivy vs. Clair: A Practical Showdown

So, which one should you use? It depends entirely on your needs.

| Feature | Trivy | Clair |
| --- | --- | --- |
| Ease of use | Exceptional. Single binary, no server setup. Perfect for developers and quick scans. | Moderate. Requires a client-server setup with a database. More of an infrastructure component. |
| CI/CD friendliness | Excellent. Designed from the ground up for easy pipeline integration. Fast and simple. | Good. Can be integrated, but requires the CI runner to communicate with the Clair server. |
| Architecture | Standalone CLI tool. Downloads its own database. | Client-server. Centralized, shared vulnerability database. |
| Primary use case | Developer machines, CI/CD pipelines, ad-hoc scanning. | Integration with container registries (like Quay), centralized scanning for large organizations. |
| Output | Human-readable tables, JSON, SARIF, and more. Very flexible. | Primarily API-driven (JSON). Relies on clients for user-friendly formatting. |

Golden Nugget: Start with Trivy. It gives you 90% of the value with 10% of the effort. If and when your organization grows to need a centralized, registry-integrated scanning solution, then it’s time to evaluate Clair.

Beyond the Scan: A Red Teamer’s Remediation Playbook

Great. You’ve run a scan and you have a report with hundreds of vulnerabilities. Your first reaction might be panic. Your second might be to ignore it because it’s too much work.

Don’t do either. A vulnerability report is not a judgment. It’s a to-do list.

The goal is not to reach zero vulnerabilities. That’s often impossible. The goal is to manage risk intelligently.

Prioritization is Everything

Not all CRITICAL vulnerabilities are created equal. You need to consider the context. A critical RCE vulnerability in a library used by your public-facing web server is a “drop everything and fix this now” problem. A critical vulnerability in a build-time tool that never makes it into the final container is a much lower priority.

Ask yourself these questions to prioritize:

  • Is the vulnerable component exposed to the internet? A flaw in Flask or Nginx is more dangerous than one in an offline data processing script.
  • Is there a known public exploit? A vulnerability that’s being actively exploited in the wild (check sources like the CISA Known Exploited Vulnerabilities Catalog) is infinitely more dangerous than a theoretical one.
  • How complex is the exploit? Does it require the attacker to already have a foothold, or can it be triggered by a single malicious web request?
  • What is the impact? Does it lead to Remote Code Execution (catastrophic), data leakage (very bad), or just a Denial of Service (bad, but maybe less critical)?
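
One way to turn those questions into a repeatable triage process is a crude scoring function. The weights below are invented for illustration; tune them to your own threat model:

```python
# Toy prioritization score combining the questions above.
# The weights are invented for illustration only.
SEVERITY_WEIGHT = {"CRITICAL": 8, "HIGH": 5, "MEDIUM": 2, "LOW": 1}

def risk_score(severity: str, internet_facing: bool,
               known_exploit: bool, rce: bool) -> int:
    score = SEVERITY_WEIGHT.get(severity, 0)
    if internet_facing:
        score *= 2          # reachable by a single malicious request
    if known_exploit:
        score += 10         # e.g. listed in CISA's KEV catalog
    if rce:
        score += 5          # code execution beats denial of service
    return score

# Internet-facing critical RCE with a public exploit: drop everything.
print(risk_score("CRITICAL", True, True, True))   # 31
# Low-severity flaw in an offline batch job: backlog.
print(risk_score("LOW", False, False, False))     # 1
```

Even a blunt instrument like this beats triaging a 159-item list by gut feel.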

The Remediation Dance: A Step-by-Step Guide

For our vulnerable-ml-app, here’s how we’d tackle the report.

1. Update the Base Image (The Biggest Win)

The vast majority of our OS-level vulnerabilities came from using debian:buster. The single most effective change we can make is to switch to a modern, supported base image.

Change this line in the Dockerfile:

FROM python:3.8-slim-buster

To this:

FROM python:3.11-slim-bookworm

This moves us to Debian 12 “Bookworm”, a much newer and actively supported release. Rebuild the image and scan it again. You’ll see that dozens of OS vulnerabilities have vanished instantly.

2. Update Application Packages

Next, tackle the application-level issues. Trivy told us Pillow==8.3.1 had a critical flaw and the fix was in version 9.0.0. We simply update our requirements.txt:

Pillow==9.0.0

Or even better, if your application isn’t sensitive to minor version changes, use a more flexible requirement and regularly update your dependencies:

Pillow>=9.0.0

3. When There’s No Easy Fix

Sometimes you’ll be stuck. Maybe the fixed version of a library breaks your code, and you don’t have time to refactor. Or maybe a vulnerability is so new that no fix exists yet. What then?

This is where mitigation comes in. You can’t remove the vulnerability, but you can make it harder to exploit. This includes runtime security measures like:

  • Dropping capabilities: Does your container really need the ability to change network settings or access raw sockets? Probably not. Run it with --cap-drop=ALL.
  • Applying Seccomp/AppArmor profiles: These are advanced Linux security features that restrict the system calls a process can make, drastically limiting what an attacker can do even if they achieve code execution.
  • Using a read-only root filesystem: Run the container with the --read-only flag to prevent an attacker from writing new files or modifying existing ones.

The Golden Rules of ML Container Hygiene

The best way to fix vulnerabilities is to not introduce them in the first place. Adopting these habits will make your containers dramatically more secure from the start.

1. Use Minimal Base Images

Don’t start with ubuntu:latest. Start with the smallest possible image that can run your application. This reduces your attack surface.

  • -slim variants: Images like python:3.11-slim are a great starting point. They have the OS basics but leave out a lot of development cruft.
  • Alpine Linux: Images based on Alpine (e.g., python:3.11-alpine) are tiny, but they use musl libc instead of the more common glibc, which can sometimes cause compatibility issues with pre-compiled Python wheels. Test carefully!
  • Distroless: Google’s “distroless” images are the ultimate in minimalism. They contain only your application and its runtime dependencies. No shell, no package manager, no utilities. An attacker who gets RCE finds themselves in an empty room with no tools.

2. Master Multi-Stage Builds

This is one of the most powerful features in modern Dockerfiles. A multi-stage build lets you use a large, tool-filled image to build your application, and then copy only the necessary artifacts into a tiny, clean production image.

Here’s how you could apply it to a Python app:


# --- Build Stage ---
# Use a full-featured image to install dependencies
FROM python:3.11 AS builder

WORKDIR /app

# Create a virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies into the venv
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# --- Final Stage ---
# Use a minimal distroless image for the final product
FROM gcr.io/distroless/python3-debian12

# Copy the virtual environment from the builder stage.
# Caveat: /opt/venv/bin/python is a symlink to the builder's interpreter,
# which may not exist at the same path in the distroless image. If it
# doesn't resolve, set PYTHONPATH to the venv's site-packages and rely on
# the distroless image's own interpreter instead.
COPY --from=builder /opt/venv /opt/venv

# Copy the application code
WORKDIR /app
COPY . .

# Set the path to use the venv
ENV PATH="/opt/venv/bin:$PATH"

EXPOSE 5000
CMD ["python", "app.py"]

The final image contains Python and our virtual environment, but none of the compilers, headers, or build tools from the builder stage. It’s lean and mean.

[Diagram: "The Power of Multi-Stage Builds" — Stage 1 ("builder") uses a full base image (e.g. python:3.11) with compilers, build tools, and headers to produce the compiled app and dependencies; COPY --from=builder moves only those artifacts into Stage 2, a minimal base (e.g. distroless) final image.]

3. Run as a Non-Root User

This is so easy, there’s no excuse not to do it. Add these lines to the end of your Dockerfile:


RUN groupadd --gid 1001 nonroot && \
    useradd --uid 1001 --gid 1001 -m nonroot
USER nonroot

This creates a new user and switches to it. If an attacker compromises your application, they are now running as a low-privilege user inside the container, not as root. Their ability to do further damage is massively constrained.
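
As a belt-and-braces check, your application can also detect at start-up when the USER directive was forgotten. A small sketch (POSIX-only, since it relies on os.geteuid):

```python
import os

def running_as_root() -> bool:
    """True if the current process has effective UID 0 (POSIX only)."""
    return os.geteuid() == 0

# At application start-up, warn loudly (or refuse to start) if the
# container was built without a USER directive.
if running_as_root():
    print("WARNING: running as root; add a USER directive to the Dockerfile.")
else:
    print("Running as a non-root user.")
```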

Conclusion

Your AI model might be the most brilliant piece of code ever written, but its genius is irrelevant if it’s served from a container that’s as secure as a screen door on a submarine. The “box” matters just as much as what’s inside it.

Container vulnerability scanning isn’t a silver bullet, but it’s the single most important first step you can take. It replaces ignorance and hope with data and a clear path forward. Tools like Trivy make it so easy to get started that there’s no longer any excuse for flying blind.

Integrate scanning into your pipeline today. Start with the most critical vulnerabilities and work your way down. Adopt better hygiene practices like minimal base images, multi-stage builds, and non-root users.

Stop treating your containers as an afterthought. Your model is valuable. Protect it. Are you scanning the box, or are you just waiting to find out who else is?