Secret Management in AI Applications: Protecting API Keys with Vault and Kubernetes Secrets

2025.10.17.
AI Security Blog

Your AI App’s Ticking Time Bomb: The API Key in Your Code

Let’s play a game. Go to GitHub. Type "sk-..." into the search bar, followed by a random string of characters. Filter by “Code”.

What do you see?

If you’re lucky, you’ll see a bunch of placeholder keys in example code. If you’re not—and you usually won’t be—you’ll see live, active, ready-to-be-abused OpenAI API keys. Keys that people have committed directly into public repositories.

Every single one of those is a company credit card, taped to a billboard, with a sign that says “FREE MONEY.”

This isn’t theoretical. I’ve seen the aftermath. The frantic Saturday morning call from a CTO who just got a $150,000 bill from their LLM provider because a developer pushed a “temporary” key for a demo app on Friday afternoon. The bots that scan GitHub for these patterns are faster than your CI/CD pipeline. They are relentless. By the time you get the commit notification email, your key is probably already powering a crypto-mining scheme or a massive content-spinning operation in a country you can’t point to on a map.

In this new gold rush of AI application development, we are moving so fast, building such incredible things, that we’re forgetting the absolute basics. We’re building skyscrapers on foundations of sand. And the most fragile part of that foundation is how we handle our secrets.

This isn’t just about OpenAI keys. It’s about your Pinecone credentials. Your Cohere key. Your cloud provider service account tokens. Your database passwords. These are the crown jewels of your AI infrastructure. And right now, there’s a good chance you’re leaving them lying around in the digital equivalent of a public park.

So, let’s talk about how to stop doing that. Let’s talk about building a fortress for your secrets. Not with buzzwords or magic solutions, but with real tools: Kubernetes and HashiCorp Vault. It’s time to go from amateur hour to professional grade.

The Anatomy of a Catastrophe

Before we get into the “how,” let’s really understand the “why.” I want to walk you through a depressingly common scenario. It’s a story that plays out, in some form or another, every single day.

It starts with a developer, let’s call her Alice. Alice is building a new RAG (Retrieval-Augmented Generation) feature. She needs to connect to a vector database and an LLM. She gets the API keys from her team lead.

Where does she put them? To get things working quickly on her laptop, she just creates a .env file. Simple.

OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxx
PINECONE_API_KEY=yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy

The feature works! It’s amazing. Time to get it into a container and deploy it to the dev cluster. Alice writes a Dockerfile. To make it easy, she just copies the keys into ENV instructions in the Dockerfile itself. Or maybe she has a script that reads the .env file and passes them in as --build-arg values during the docker build.

Now the key is literally embedded in a layer of the container image. A permanent, fossilized record of the secret.

Next, she needs to share her work. She creates a new branch, commits her code… and accidentally includes the .env file. She didn’t have .env in her .gitignore. A tiny, one-line mistake.

The code gets pushed. The CI/CD pipeline kicks in, builds the container image with the embedded keys, and pushes it to a public container registry like Docker Hub because, hey, it’s just a dev project, right? Who cares if the image is public?

The disaster has now branched into two separate, equally lethal paths.

  1. The Git Path: An automated scanner, constantly watching GitHub for commits containing high-entropy strings that look like API keys, finds the .env file within seconds. It extracts the keys. The attacker now has direct, unfettered access to your LLM and vector DB accounts.
  2. The Container Path: Another scanner, this one focused on public container registries, pulls the new image. It runs tools that inspect image layers and configurations. It finds the API keys hardcoded in the environment variables of the image manifest. The attacker now has the same keys.
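These bots don’t need to be clever. Here’s a minimal sketch of the kind of pattern matching they run against a freshly cloned repo; the regex is illustrative, and real scanners combine dozens of provider-specific patterns with entropy checks:

```shell
# Throwaway directory standing in for a freshly cloned repo,
# seeded with an obviously fake key for demonstration
demo=$(mktemp -d)
printf 'OPENAI_API_KEY=sk-abcdefghij1234567890ABCDEFGHIJ\n' > "$demo/.env"

# The harvest: recursively flag anything shaped like an OpenAI-style key
grep -rEn 'sk-[A-Za-z0-9]{20,}' "$demo"
```

That one grep, pointed at every new public commit on GitHub, is the whole business model of the attacker bots in the diagram.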

By the time Alice wakes up the next morning, thousands of dollars in fraudulent charges have been racked up. The vector database might have been wiped or, worse, subtly poisoned with bad data. The company’s reputation is on the line. And it all started with one “harmless” little file.

[Diagram: the two leak paths. Keys in a .env on the dev laptop reach public GitHub via git push, and a public container registry via the CI/CD pipeline. Attacker Bot 1 scans commits for keys; Attacker Bot 2 scans public images.]

Think this is exaggerated? It’s not. The only unrealistic part of that story is that it was just one developer. In a real team, the number of opportunities for this mistake to happen multiplies with every person you add.
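A quick, hedged aside for anyone currently living Alice’s story: the mechanical fix for the .gitignore mistake is small. Demonstrated here in a throwaway repo with placeholder values — and note the hard truth in the final comment:

```shell
# Throwaway repo reproducing the mistake
repo=$(mktemp -d) && cd "$repo" && git init -q .
echo 'OPENAI_API_KEY=sk-placeholder' > .env
git add .env
git -c user.email=demo@example.com -c user.name=demo commit -qm 'oops'

# The fix: ignore .env going forward, and stop tracking it
echo '.env' >> .gitignore
git rm --cached -q .env        # untracks the file; it stays on disk
git add .gitignore
git -c user.email=demo@example.com -c user.name=demo commit -qm 'untrack .env'

# The key is STILL in Git history. If this repo was ever pushed:
# 1) rotate the key immediately, 2) scrub history (e.g. with git-filter-repo).
```

Rotation comes first, always: by the time you are rewriting history, the scanners have already read it.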

The Rogues’ Gallery of Bad Practices

Let’s put these mistakes under a microscope. As a red teamer, these are the bread and butter, the low-hanging fruit I look for first. They are practically an invitation for a breach.

Hardcoding in Source Code
  • Why it feels “easy”: It’s right there! No need to configure anything. Just client = OpenAI("sk-...") and you’re done.
  • Why it’s a disaster: The key becomes part of your company’s intellectual property, forever etched in your Git history. Even if you remove it from the current code, it survives in the history, retrievable by anyone who gets a copy of your repo. Decompiling your application binary can also reveal it.

Plaintext in Config Files
  • Why it feels “easy”: Seems more organized than hardcoding. You can have different configs for dev, staging, and prod.
  • Why it’s a disaster: These files (config.json, settings.yaml) are often committed to Git by mistake. If an attacker gets file-read access to your server, it’s game over. There’s zero protection.

Environment Variables
  • Why it feels “easy”: This is the “12-Factor App” way, right? It separates config from code. It feels professional.
  • Why it’s a disaster: Better, but far from safe. Anyone with access to the machine can run env or inspect the process’s environment. In Kubernetes, anyone with get pods and exec permissions can see them. They often get logged by mistake in crash reports or debugging output. And they are frequently baked into container images.

Baking into Container Images
  • Why it feels “easy”: The image becomes a self-contained, “ready-to-run” unit. Super convenient for deployment.
  • Why it’s a disaster: One of the worst. An image is a collection of tarballs. It’s trivial to pull an image and run docker history, or use tools like dive to inspect every layer and find the exact command where the secret was added. It’s a permanent vulnerability.

Golden Nugget: A secret that is written to disk, committed to Git, or baked into an artifact is no longer a secret. It’s a liability waiting to be discovered. The goal is not to hide secrets better, but to never have to handle them in the first place.

A New Mental Model: Secrets as Radioactive Waste

Stop thinking of secrets as simple strings you pass around. Start thinking of them as small, intensely radioactive rods.

Would you email a radioactive rod to a coworker? Would you check it into source control? Would you leave it sitting on a public web server?

No. You’d build a special, lead-lined container. You’d grant access on a need-to-know basis. You’d handle it with specialized equipment, for the shortest possible time, and then you’d put it right back in its container. You’d have a log of every single time it was accessed.

This is the mindset we need. Our job as engineers is to build that lead-lined container for our applications. The secret should only exist, unencrypted, in the memory of the application process for the brief moment it’s needed to make an API call. That’s it.

This is where dedicated secret management tools come in. They are the lead-lined containers. Today, we’re focusing on two key players in the cloud-native world: Kubernetes Secrets and HashiCorp Vault.

The Tools of the Trade: Kubernetes Secrets vs. Vault

If you’re running your AI applications on Kubernetes (and you probably should be), you have a built-in option for handling secrets. It’s called, unsurprisingly, a “Secret.”

Level 1: Kubernetes Secrets

A Kubernetes Secret is a first-class object in the K8s API, just like a Pod or a Service. Its purpose is to hold a small amount of sensitive data. You can create a secret and then mount it into your pod as either an environment variable or a file in a volume.
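The mechanics look something like this — a sketch, assuming a working kubectl context and an ai-apps namespace (the names and the image are illustrative):

```shell
# Store the key as a Secret object
kubectl -n ai-apps create secret generic openai-creds \
  --from-literal=OPENAI_API_KEY='sk-placeholder'

# Mount it into a pod as a read-only file volume
kubectl -n ai-apps apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: rag-app
spec:
  containers:
    - name: app
      image: registry.example.com/rag-app:latest  # hypothetical image
      volumeMounts:
        - name: openai-creds
          mountPath: /etc/secrets
          readOnly: true
  volumes:
    - name: openai-creds
      secret:
        secretName: openai-creds
EOF
```

The file mount is generally preferable to injecting the Secret as an environment variable, for reasons the Rogues’ Gallery above already spelled out.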

Sounds perfect, right? Well, there’s a catch. A huge one.

By default, Kubernetes Secrets are just Base64 encoded. Let me say that again.

Base64 is not encryption. It is encoding.

It’s a way of representing binary data in text. It’s like writing a message in pig latin. Anyone who knows the “secret” (how Base64 works) can instantly decode it. It offers zero confidentiality. It only prevents special characters from messing up your terminal.

$ echo -n "sk-my-secret-key" | base64
c2stbXktc2VjcmV0LWtleQ==

$ echo "c2stbXktc2VjcmV0LWtleQ==" | base64 --decode
sk-my-secret-key

(The -n matters: without it, echo’s trailing newline gets encoded into the value — a classic source of mysteriously broken keys.)

The real security of Kubernetes Secrets hinges on two things: Kubernetes RBAC (Role-Based Access Control) and the encryption of the underlying etcd database where Kubernetes stores all its state.

  • RBAC: You can control who (which users or service accounts) can get, list, or watch secrets. This is crucial. If you can’t read the Secret object, you can’t decode its contents.
  • Encryption at Rest: The Secret object is stored in etcd. You can (and absolutely must) configure your Kubernetes cluster to encrypt the etcd data at rest. If an attacker gets a backup of your etcd database, this is your last line of defense. Many managed cloud Kubernetes offerings (EKS, GKE, AKS) enable this by default, but you should always verify.
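That’s worth making concrete. Assuming you have permission to read a Secret object (here a hypothetical openai-creds in the ai-apps namespace), recovering the plaintext is a one-liner:

```shell
kubectl -n ai-apps get secret openai-creds \
  -o jsonpath='{.data.OPENAI_API_KEY}' | base64 --decode
```

No exploit, no cracking. RBAC is the only thing standing between that command and your key.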
[Diagram: inside the Kubernetes cluster, the AI App pod mounts a K8s Secret object (data: { key: “c2st…” }), which is stored in etcd. The secret is only Base64 encoded in the API; security depends entirely on etcd encryption at rest.]

So, what’s the verdict on K8s Secrets? They are:

  • Good for: Simple use cases, secrets that don’t change often, and when you have a robust RBAC and encrypted etcd setup.
  • Bad because:
    • They are static. The secret lives forever until you manually change it. There’s no concept of a lease or time-to-live (TTL). A leaked static secret is a permanent backdoor.
    • No centralized auditing. You can audit K8s API access, but it’s not a dedicated, fine-grained audit log for secrets. Who read the secret? When? Why? It’s harder to answer.
    • Management can be clumsy. Distributing and rotating secrets across many clusters and applications requires a lot of manual work or custom tooling.

K8s Secrets are a foundational piece, but they are not the complete fortress. They are more like a locked file cabinet in an office. If you’re already inside the office, and you have the key to the cabinet, the contents are yours.

Level 99: HashiCorp Vault

If Kubernetes Secrets are a locked file cabinet, Vault is a Swiss bank. It’s a purpose-built tool designed from the ground up for one thing: managing secrets.

Vault’s philosophy is completely different. It operates on the principle of dynamic secrets. The idea is that your application should never touch a long-lived, static credential. Instead, when your app starts, it authenticates to Vault, proves who it is, and Vault generates a brand-new, short-lived credential just for that instance of the app. That credential might only be valid for an hour. If it leaks, the window of opportunity for an attacker is tiny.

This is a game-changer.

Vault’s core components are:

  • Storage Backend: This is where Vault stores its data, but everything written to the storage backend is heavily encrypted. The storage itself never sees plaintext. The encryption key is protected by a master key that, by default, is split into shares (Shamir’s Secret Sharing) held by different operators; providing a quorum of those shares is the “unsealing” process.
  • Auth Methods: This is how your applications prove their identity to Vault. It’s not about usernames and passwords. It’s about trusted identities. For an app in Kubernetes, the best method is the Kubernetes Auth Method, where the app uses its native K8s Service Account Token to authenticate. Vault then validates this token with the K8s API.
  • Secret Engines: This is where the magic happens. Instead of just storing key-value pairs (which it can do), Vault has engines that can dynamically generate secrets on the fly. It can create temporary AWS IAM users, database credentials, or PKI certificates.
  • Audit Trail: Every single action in Vault is logged in a detailed, tamper-proof audit trail. Every authentication attempt, every secret read. This is a goldmine for security teams.
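Wiring this up for a Kubernetes app looks roughly like the following — a sketch against a running, unsealed Vault with the KV v2 engine mounted at secret/ (the app name, namespace, and paths are all illustrative):

```shell
# Store the static key (until you migrate to a dynamic engine)
vault kv put secret/ai/openai api_key='sk-placeholder'

# Enable and configure the Kubernetes auth method
vault auth enable kubernetes
vault write auth/kubernetes/config \
  kubernetes_host="https://$KUBERNETES_SERVICE_HOST:443"

# Policy: this app may read exactly one secret, nothing else
vault policy write rag-app - <<'EOF'
path "secret/data/ai/openai" {
  capabilities = ["read"]
}
EOF

# Bind the policy to one service account in one namespace, short TTL
vault write auth/kubernetes/role/rag-app \
  bound_service_account_names=rag-app \
  bound_service_account_namespaces=ai-apps \
  policies=rag-app \
  token_ttl=1h
```

Note how every line narrows the blast radius: one path, one service account, one namespace, one hour.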
[Diagram: Vault’s architecture. An HTTP API fronts the core logic/ACLs, encrypted storage, and the audit log. Auth methods (K8s auth, AppRole, user/pass) answer “Who are you?”; secret engines (KV store, database, AWS/GCP) answer “What do you need?”]

The downside? Vault is another piece of critical infrastructure you have to run, manage, and secure. It’s not trivial. But the security payoff is immense.

The Golden Pattern: Vault as the Brain, Kubernetes as the Hands

So, it’s not really a “versus” situation. The most powerful, professional setup uses both tools together, playing to their strengths.

  • Vault acts as the central, highly-secure “source of truth” for all secrets across your entire organization.
  • Kubernetes provides the runtime environment and the identity mechanism for your applications.

The magic that connects them is the Vault Agent Injector.

This is a component you run in your Kubernetes cluster. It watches for new pods being created. If a pod has specific annotations in its manifest, the injector automatically does two things:

  1. It adds a small, lightweight Vault Agent container to your pod (a “sidecar”).
  2. It creates a shared, in-memory volume (a tmpfs) that both the Vault Agent and your main application container can access.

Here’s the flow. It’s beautiful.

  1. Your AI app’s pod is scheduled to start.
  2. The Vault Agent Injector sees the annotations and adds the Vault Agent sidecar.
  3. The pod starts. The Vault Agent sidecar starts first.
  4. The Vault Agent reads the pod’s own Service Account Token, which Kubernetes automatically mounts into every pod. This token is a short-lived, signed JWT that proves the pod’s identity (e.g., “I am the pod named ‘rag-app-7f8c9d’ in the ‘ai-apps’ namespace”).
  5. The agent presents this token to Vault’s Kubernetes Auth endpoint.
  6. Vault receives the token, and to verify it’s not a fake, it makes a call back to the Kubernetes API server. “Hey K8s, is this a valid token for this pod?”
  7. Kubernetes confirms the token’s validity.
  8. Vault checks its policies. “Ah, the service account for ‘rag-app’ is allowed to read the OpenAI key.”
  9. Vault issues a short-lived Vault token back to the Vault Agent sidecar.
  10. The sidecar, now authenticated, requests the actual OpenAI key from Vault’s Key-Value store.
  11. Vault returns the key.
  12. The Vault Agent sidecar writes the key to a file in the shared in-memory volume, for example, at /vault/secrets/openai-key.
  13. Now, your main AI application container starts up. It doesn’t need any special logic. It just reads its API key from a local file at /vault/secrets/openai-key.
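In practice, you don’t script any of those steps yourself. The whole flow is driven by a few annotations on the pod template — a sketch with illustrative names (the Vault role and secret path must match whatever you configured on the Vault side):

```shell
kubectl -n ai-apps apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rag-app
spec:
  replicas: 1
  selector:
    matchLabels: { app: rag-app }
  template:
    metadata:
      labels: { app: rag-app }
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "rag-app"
        # materializes the secret at /vault/secrets/openai-key
        vault.hashicorp.com/agent-inject-secret-openai-key: "secret/data/ai/openai"
    spec:
      serviceAccountName: rag-app
      containers:
        - name: app
          image: registry.example.com/rag-app:latest  # hypothetical image
EOF
```

Three annotations, and the injector handles the sidecar, the authentication, and the shared volume for you.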

The best part? The Vault Agent keeps the secret fresh. If the secret has a 1-hour lease, the agent will automatically go back to Vault before it expires and get a new one, updating the file on the fly. Your application doesn’t even need to know it’s happening.

Golden Nugget: Your application code becomes blissfully ignorant. It has no idea what Vault is. It doesn’t handle tokens or authentication. All it knows is that its required API key will be waiting for it in a file at a predictable location. This is the ultimate separation of concerns.
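That ignorance is easy to demonstrate. Standing in for the injector’s tmpfs with a local directory (the path and value are placeholders), the application side of “secret handling” collapses to reading a file:

```shell
# Stand-in for the in-memory volume the injector mounts at /vault/secrets
secrets_dir=$(mktemp -d)
printf '%s' 'sk-placeholder' > "$secrets_dir/openai-key"

# The entirety of the app's secret logic: read the file when needed.
# Re-reading on each use also picks up the agent's automatic renewals.
OPENAI_API_KEY="$(cat "$secrets_dir/openai-key")"
echo "loaded key of length ${#OPENAI_API_KEY}"
# → loaded key of length 14
```

No Vault SDK, no tokens, no retry logic in your application code.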

[Diagram: the injection flow. 1. A pod creation request hits the K8s API server. 2. The Vault Agent Injector webhook adds the sidecar. 3. The sidecar authenticates to Vault with the pod’s Service Account token. 4–5. Vault validates the token against the K8s API. 6. Vault returns a short-lived secret. 7. The sidecar writes it to the shared memory volume (/vault/secrets). 8. The app container reads it from the file.]

Now You’re Thinking with Portals: Advanced Threats

So you’ve set up Vault and the agent injector. You’re feeling pretty good. Your API key isn’t in Git, it’s not in the container image, it’s not even in an environment variable. You’re done, right?

Not so fast. As a red teamer, my job is to assume you’ve done the basics right and find the next weakest link. The attack surface has shrunk, but it hasn’t disappeared.

Attack Vector 1: The Noisy Neighbor (Sidecar Compromise)

Your pod now contains at least two containers: your app and the Vault agent. What if your main AI app has a vulnerability—say, a remote code execution (RCE) flaw in the web framework it uses? An attacker gets a shell inside your AI app container.

What’s the first thing they’ll do? Look around. They’ll see the mounted volume at /vault/secrets. They’ll read the key. Game over.

But what if you have a third container in that pod? Maybe a logging sidecar, or a service mesh proxy like Istio’s. If an attacker compromises that container, and it’s part of the same pod, it has access to the same shared volume. It can also steal the key meant for your AI app.

The Mitigation: The Principle of Least Privilege, applied to pods. Keep your pods as small and single-purpose as possible. Don’t throw unrelated containers into the same pod just for convenience. Every container you add is another potential attack vector against the shared secrets.

Attack Vector 2: The First Secret Problem

Our entire beautiful workflow relies on one “first secret”: the pod’s Service Account Token (SAT). That token is the key that unlocks Vault. Therefore, the security of that token, and the permissions granted to that Service Account, are paramount.

If an attacker can compromise the Kubernetes API or has overly permissive RBAC roles, they could potentially create a pod with the same Service Account as your AI app. That malicious pod would get its own valid token, authenticate to Vault as your AI app, and steal its secrets.

The Mitigation: Harden your Kubernetes RBAC. This is non-negotiable.

  • Don’t use the default service account. Create specific, named Service Accounts for each application.
  • Apply the Principle of Least Privilege to your RBAC roles. The role bound to your AI app’s service account should have the absolute minimum permissions it needs to function. It probably doesn’t need to list secrets across the whole cluster, for example.
  • Audit your RBAC policies regularly. Use tools like kubectl-who-can to understand what your service accounts are actually allowed to do.
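Here’s what a least-privilege setup can look like in manifest form — a sketch with illustrative names; if the app gets everything through the Vault injector, its Role may not even need this much:

```shell
kubectl -n ai-apps apply -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rag-app
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rag-app-minimal
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["openai-creds"]   # one named Secret, nothing cluster-wide
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rag-app-minimal
subjects:
  - kind: ServiceAccount
    name: rag-app
    namespace: ai-apps
roleRef:
  kind: Role
  name: rag-app-minimal
  apiGroup: rbac.authorization.k8s.io
EOF
```

A namespaced Role bound to one named Service Account, with get on one named resource: that is the shape every one of your app roles should have.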

Attack Vector 3: Audit Log Blindness

Vault produces one of the best audit trails in the business. It logs every request and every response; sensitive values are HMACed rather than written in plaintext. This is an incredible detection tool.

But it’s useless if no one is looking at it.

An attacker might try to authenticate to Vault from an unexpected place. They might try to access a secret they aren’t allowed to. They might successfully authenticate but then try to read ten different secrets when your app only ever needs one. All of these actions will be logged.

The Mitigation: Ship your Vault audit logs to a proper SIEM (Security Information and Event Management) system. Set up alerts for suspicious activity:

  • Authentication failures from a known service account.
  • Access to secrets outside of normal business hours.
  • A single entity reading an unusually large number of secrets.
  • Denials based on policy.
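A hedged sketch of a tripwire on that last one. The JSON below is a simplified, synthetic stand-in for a Vault file-audit entry (real entries carry many more fields, and sensitive values are HMACed); any real deployment should do this in the SIEM, not with grep:

```shell
# Synthetic, simplified audit entry for demonstration only
log=$(mktemp)
cat > "$log" <<'EOF'
{"time":"2025-10-17T09:00:00Z","type":"response","error":"permission denied","request":{"path":"secret/data/ai/anthropic-key","operation":"read"}}
EOF

# Crude alert condition: any policy denial in the window is worth a look
denials=$(grep -c 'permission denied' "$log")
echo "policy denials: $denials"
# → policy denials: 1
```

The point isn’t the tooling; it’s that a denial means someone asked for something your policies say they should never need.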

Your audit log is your tripwire. You have to make sure it’s plugged in.

It’s Time to Clean House

We’ve covered a lot of ground, from catastrophic failures to a robust, professional-grade solution. The core message is simple: secrets are toxic. Your goal should be to minimize their lifespan and the number of systems that ever see them in plaintext.

Hardcoding secrets is like storing gasoline next to an open flame. Using basic Kubernetes Secrets is an improvement, like putting the gasoline in a sealed can, but it’s still in the same room. Using Vault with the agent injector is like moving the gasoline to a fireproof bunker, and only piping in the vapor you need for the split second your engine is running.

Is setting up Vault and Kubernetes integration more work than just pasting a key into your code? Yes. Unquestionably.

But it’s the difference between being a hobbyist and being an engineer. It’s the difference between a ticking time bomb and a resilient, secure system. The AI world is moving at a breakneck pace, but that’s no excuse for sloppy security. The biggest threat to your amazing new AI application isn’t a rival model or a market shift; it’s a single line of code, a single leaked key that brings the whole thing crashing down.

So, I’ll ask you one final question. A slightly uncomfortable one.

Do you know where all your keys are, right now?