Confidential Computing for AI: Protecting Data in Use with Intel SGX and AMD SEV

October 17, 2025
AI Security Blog

Confidential Computing for AI: Your Last Line of Defense When Everything Else Fails

You’ve done everything right. Your S3 buckets are locked down with iron-clad IAM policies. Your databases are encrypted at rest with AES-256. Your network traffic is wrapped in TLS 1.3. You’re a security champion. You sleep well at night.

Now, let me ask you a question that might ruin that sleep. What happens the moment your multi-million-dollar AI model loads that encrypted patient data from the database into memory to make a prediction?

What happens when your proprietary algorithm, the very heart of your company’s IP, is sitting there, unencrypted, in the server’s RAM?

In that moment, all your beautiful encryption-at-rest and encryption-in-transit becomes a historical artifact. The data is live. It’s in use. It’s naked. And if an attacker gets root on that box—or worse, if the cloud provider itself has a rogue employee or is compelled by a government—they can dump the memory of your process and walk away with everything. The model, the data, the whole kingdom.

This isn’t a theoretical threat. This is the dirty secret of cloud computing. We protect the data when it’s sitting still and when it’s moving, but we largely ignore its most vulnerable state: when it’s actually being processed.

This is where Confidential Computing comes in. It’s not another layer of software security. It’s a fundamental shift, enforced by silicon. Today, we’re diving deep into the tech that makes it possible, specifically for AI workloads: Intel SGX and AMD SEV.

The Three States of Data: The Gap in Our Armor

In security, we talk about the three states of data. It’s a simple model, but it’s crucial.

  1. Data at Rest: This is your data sitting on a disk, in a database, or in object storage. We solve this with file system or database encryption (like TDE, BitLocker, etc.). It’s a locked safe.
  2. Data in Transit: This is data moving across a network. We solve this with protocols like TLS and SSH. It’s an armored truck moving the safe.
  3. Data in Use: This is data loaded into memory (RAM) and the CPU for active processing. For decades, the answer here has been… well, mostly hopes and prayers.

Think of it like a master jeweler. The gold is locked in a vault (at rest). It’s transported in an armored car (in transit). But to craft a masterpiece, the jeweler must take the gold out, place it on the workbench, and shape it. At that moment, on the workbench, it’s completely exposed. Anyone who can get into the workshop can just grab it.

In our world, the cloud hypervisor, the host OS, and any privileged administrator are all inside that workshop.

[Diagram: the three states of data. Data at rest (on disk or in a database, protected by disk encryption such as TDE): SECURE. Data in transit (moving over the network, protected by TLS/SSH): SECURE. Data in use (in RAM and CPU registers, protected by… OS permissions?): VULNERABLE.]

For AI, this is a catastrophic failure. Why? Because the “gold” isn’t just the data; it’s also the jeweler’s secret techniques—the AI model itself. Both the sensitive training/inference data and the priceless model weights are sitting in RAM, clear as day, for any sufficiently privileged process to inspect.

Confidential Computing: The Vaulted Workshop

Confidential Computing aims to solve this “data in use” problem by creating a hardware-enforced isolated environment called a Trusted Execution Environment (TEE). The goal is to protect data even from the cloud provider, the host OS kernel, the hypervisor, and any other process on the system.

Forget the jeweler’s open workshop. A TEE is more like a sealed, automated, vaulted workshop delivered by a trusted manufacturer.

  • You can’t see inside it.
  • You can’t tamper with what’s happening inside.
  • You can pass sealed containers of raw materials (encrypted data) into a specific port.
  • The workshop does its job.
  • A finished product (the inference result) comes out of another port.

Crucially, you can verify the integrity of this workshop from afar. You can get a cryptographically signed guarantee from the manufacturer (the CPU vendor) that this is a genuine, untampered workshop and that it’s running the exact blueprint (your code) you authorized. This verification is called attestation, and it’s the absolute cornerstone of this technology.

Golden Nugget: Confidential Computing isn’t about trusting the cloud provider’s promises or their software stack. It’s about reducing your trust to a single, verifiable piece of hardware: the CPU itself. You are moving from a model of “operational trust” to “cryptographic trust.”

The two dominant technologies in this space come from the two giants of the CPU world: Intel’s SGX and AMD’s SEV. They both aim to create a TEE, but their philosophies and implementations are wildly different.

The Contenders: Intel SGX vs. AMD SEV

Choosing between SGX and SEV is like choosing between building a panic room in your house versus putting the entire house under a military-grade security dome. Both provide protection, but the approach, scope, and trade-offs are completely different.

Intel SGX: The Panic Room (Software Guard Extensions)

Intel’s approach is surgical. SGX allows an application to carve out a small, private region of its own memory space called an enclave. Code and data placed inside this enclave are automatically encrypted by the CPU. Any attempt to access that memory from outside the enclave—even by the kernel or a debugger running with root privileges—will be blocked by the hardware, and the data will be served up as encrypted garbage.

Think of your application as a large mansion. Most of it is open to the staff (the OS). But you build a small, vault-like panic room (the enclave) where you handle your most precious secrets. The staff can see the door to the panic room, they know it exists, but they have no way to get inside or see what’s happening in there. The application explicitly passes data in and out of this sealed room through a very controlled interface.

[Diagram: on a host system run by the cloud provider (hypervisor / OS kernel), your application process is split in two. The untrusted part handles network I/O and logging; the Intel SGX enclave holds the sensitive code (model.predict(data)) and sensitive data (patient X-ray, model weights), encrypted in memory. ECALL/OCALL is the controlled interface between them. An attacker, even with root, gets ACCESS DENIED by the CPU.]

How it works for AI:

You wouldn’t put your entire TensorFlow or PyTorch application inside an enclave. That would be incredibly inefficient and difficult. Instead, you isolate only the most critical part: the inference function.

  1. The main application handles networking, loading the encrypted model from disk, and receiving encrypted input data.
  2. It then makes a special function call (an “ECALL”) to enter the enclave.
  3. Inside the enclave, the model weights and input data are decrypted. The model.predict() function runs on the plaintext data.
  4. The result is produced, re-encrypted, and passed back out of the enclave to the untrusted part of the application, which then sends it back to the client.
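The four steps above can be sketched in miniature. This is a toy simulation of the trusted/untrusted partition, not real SGX SDK code: the class and method names (`ToyEnclave`, `ecall_predict`) are illustrative, and the XOR keystream stands in for the authenticated encryption (e.g., AES-GCM) a real enclave would use.

```python
import hashlib
import json

def _keystream(key: bytes, length: int) -> bytes:
    # Toy keystream derived from SHA-256; a stand-in for real AES-GCM.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    # Symmetric: the same call encrypts and decrypts.
    ks = _keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

class ToyEnclave:
    """Stands in for the trusted (ECALL) side of the partition."""
    def __init__(self, sealed_key: bytes):
        # In a real system this key is provisioned only after attestation.
        self._key = sealed_key

    def ecall_predict(self, enc_weights: bytes, enc_input: bytes) -> bytes:
        # Decrypt only inside the "enclave" boundary.
        weights = json.loads(xor_crypt(self._key, enc_weights))
        features = json.loads(xor_crypt(self._key, enc_input))
        # Run the sensitive computation on plaintext (a toy linear model).
        score = sum(w * x for w, x in zip(weights, features))
        # Re-encrypt the result before it crosses back to the untrusted side.
        return xor_crypt(self._key, json.dumps(score).encode())

# Untrusted side: only ciphertext ever crosses the boundary.
key = b"demo-key-provisioned-after-attestation"
enclave = ToyEnclave(key)
enc_w = xor_crypt(key, json.dumps([0.5, -1.0, 2.0]).encode())
enc_x = xor_crypt(key, json.dumps([1.0, 2.0, 3.0]).encode())
result = json.loads(xor_crypt(key, enclave.ecall_predict(enc_w, enc_x)))
print(result)  # 0.5*1 - 1*2 + 2*3 = 4.5
```

In a real deployment the partition is defined in an EDL file and built with the SGX SDK (or hidden behind a library OS such as Gramine), but the shape is the same: plaintext exists only on the trusted side of the ECALL.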

Pros and Cons of SGX

  • Pro: Minimal Trusted Computing Base (TCB). This is SGX’s killer feature. Your TCB is just your specific enclave code and the Intel CPU. You don’t have to trust the application’s OS, the kernel, or the hypervisor. A smaller TCB means a smaller attack surface.
  • Pro: Strong Isolation. The process-level isolation is very granular and powerful. One compromised application on the server can’t affect the enclave of another.
  • Con: Invasive and Hard to Implement. You can’t just take an existing application and “run it in SGX.” It requires significant and careful refactoring to partition the code into trusted (enclave) and untrusted parts. This is a major engineering effort.
  • Con: Performance and Size Limitations. Historically, enclaves were limited to roughly 128 MB of protected memory (the EPC), though newer CPUs raise this dramatically. There’s also a performance cost for entering and exiting the enclave (the ECALL/OCALL overhead), which makes SGX a poor fit for “chatty” applications that frequently cross the boundary.

AMD SEV: The Fortress (Secure Encrypted Virtualization)

AMD took a completely different, much broader approach. Instead of creating a tiny panic room inside an application, SEV aims to protect the entire virtual machine from the underlying hypervisor.

With SEV, the CPU encrypts all the memory used by a guest VM with a key that is inaccessible to the hypervisor. When the VM needs to access memory, the CPU decrypts it on the fly, performs the operation, and re-encrypts the result before writing it back to RAM. From the hypervisor’s perspective, the VM’s memory is just a blob of unintelligible ciphertext. It can manage the VM (start, stop, migrate), but it can’t inspect its contents.

This is the “security dome” analogy. You don’t worry about reinforcing individual rooms; you just drop a massive, impenetrable dome over the entire property. Your existing house (your VM) can continue to function exactly as it did before, unaware of the dome protecting it from the outside world (the hypervisor).

[Diagram: on an untrusted host/hypervisor (e.g., KVM) in the cloud provider’s infrastructure, a standard VM’s memory is visible to the hypervisor. An AMD SEV-SNP protected VM, guest OS (e.g., Ubuntu) and full AI application (Python + TensorFlow + Gunicorn) included, has its entire memory encrypted. A hypervisor trying to inspect it sees only ciphertext: ACCESS DENIED.]

The Evolution of SEV:

SEV has gone through critical iterations:

  • SEV: The first version. Encrypted the memory, but the hypervisor could still replay or tamper with the encrypted memory pages, and CPU register state wasn’t fully protected.
  • SEV-ES (Encrypted State): Added encryption for the CPU register state, preventing the hypervisor from reading or modifying that sensitive information when a VM is interrupted.
  • SEV-SNP (Secure Nested Paging): This is the game-changer. It adds strong memory integrity protection. The hardware prevents the hypervisor from maliciously replaying, remapping, or modifying the ciphertext in memory. It provides a much stronger guarantee that what the VM wrote to memory is what it will read back. For any serious security workload, SNP is the baseline you should be looking for.
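Before depending on any of these guarantees, it’s worth checking what the host actually advertises. A hedged sketch, assuming a Linux host with a reasonably recent kernel, where kvm_amd exposes module parameters such as `sev`, `sev_es`, and `sev_snp` (parameter availability varies by kernel version):

```python
from pathlib import Path

def sev_feature_enabled(feature: str = "sev") -> bool:
    """Best-effort check of a kvm_amd module parameter.

    Reads /sys/module/kvm_amd/parameters/<feature>, which typically holds
    'Y' or '1' when the feature is enabled. Returns False if the parameter
    is absent (e.g., non-AMD hardware or an older kernel).
    """
    param = Path("/sys/module/kvm_amd/parameters") / feature
    try:
        return param.read_text().strip().lower() in {"y", "1"}
    except OSError:
        return False

for feat in ("sev", "sev_es", "sev_snp"):
    status = "enabled" if sev_feature_enabled(feat) else "unavailable"
    print(f"{feat}: {status}")
```

On a cloud confidential VM you would normally rely on the provider’s launch options plus attestation rather than this host-side check, but it is a quick way to see which SEV generation a bare-metal box can offer.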

Pros and Cons of SEV

  • Pro: “Lift and Shift” Easy Adoption. This is SEV’s main selling point. You can take an existing VM image, boot it on an SEV-SNP capable host with the right flags, and it just works. No code modification is needed. This drastically lowers the barrier to entry.
  • Pro: Protects Everything. It protects the entire VM, including the guest OS, from the hypervisor. This is great for legacy applications or complex multi-process services that would be a nightmare to refactor for SGX.
  • Con: Larger Trusted Computing Base (TCB). The flip side of protecting everything is that you now have to trust everything inside the VM. Your TCB includes your application, all its dependencies, and the entire guest OS kernel. A vulnerability in the Linux kernel inside your VM could be used to compromise your application.
  • Con: Coarser-grained Isolation. The isolation boundary is the VM, not the process. It doesn’t protect applications inside the VM from each other or from a compromised OS within that same VM.

Head-to-Head Comparison

Let’s put them side-by-side. There’s no “better” technology, only what’s better for your specific threat model and engineering constraints.

| Feature | Intel SGX | AMD SEV-SNP |
| --- | --- | --- |
| Analogy | The Panic Room | The Fortress Dome |
| Isolation Unit | Part of a process (enclave) | Entire virtual machine |
| Ease of Adoption | Hard: requires significant code refactoring | Easy: “lift and shift” compatible with existing VMs |
| Trusted Computing Base (TCB) | Very small: your enclave code + CPU | Large: your app + guest OS + CPU |
| Protects Against | Compromised host OS, hypervisor, other processes | Compromised hypervisor |
| Doesn’t Protect Against | Bugs within your own enclave code | Bugs in your app or the guest OS kernel |
| Best for AI… | A small, well-defined, ultra-sensitive computation (e.g., a single inference or key decryption), when you can afford the engineering effort | A complex, existing AI application (e.g., a full training environment or a multi-service inference stack) that must run without modification |

The Secret Handshake: Why Attestation is Everything

So you’ve got a TEE running in the cloud. Great. How do you know it’s a real TEE and not some clever emulator run by an attacker? How do you know it’s running the correct, unmodified version of your AI model and not a version with a malicious backdoor?

This is where remote attestation comes in. It is, without a doubt, the most important and powerful feature of confidential computing.

It’s a cryptographic protocol that lets your TEE prove its identity and the integrity of its software to a remote party (the client). It’s the digital equivalent of a diplomatic courier showing you their special government-issued ID and a tamper-evident seal on their pouch before you hand over your classified documents.

Here’s a simplified breakdown of how it works:

  1. The Challenge: Your client application wants to send sensitive data to the AI service running in the TEE. Before it does, it challenges the TEE to prove itself.
  2. The Measurement: When the TEE (enclave or VM) is created, the CPU’s hardware measures the code and configuration that is loaded into it. This measurement is a cryptographic hash (e.g., a SHA-256 digest). It’s like a unique fingerprint of the software.
  3. The Report/Quote: The TEE asks the CPU to generate a signed report. This report contains the software measurement (the hash), a nonce provided by the client (to prevent replay attacks), and other important metadata.
  4. The Hardware Signature: The CPU signs this entire report using a special private key that is fused into the silicon during manufacturing. This key is unique to that specific CPU and is part of a chain of trust that leads back to Intel or AMD. This is the unforgeable ID card.
  5. The Verification: The client receives this signed report. It can’t check the signature on its own; it needs the certificate chain that vouches for that particular CPU’s key. So it verifies the report against the hardware vendor’s attestation infrastructure (e.g., services built on Intel’s DCAP or AMD’s Key Distribution Service).
  6. The Verdict: The vendor’s service, which knows all the valid CPU keys, verifies the signature. It checks if the CPU is genuine, if its microcode is up to date, and if it has any known vulnerabilities. It then returns a “pass” or “fail” to the client.
  7. The Trust Decision: If verification passes, the client now knows two things with cryptographic certainty:
    • It is communicating with a genuine Intel/AMD CPU running a real TEE.
    • It knows the exact fingerprint (hash) of the software running inside that TEE.
  8. The Final Check: The client compares the hash from the report with the known-good hash of the software it expects to be running. If they match, trust is established. The client can now provision secrets (like data encryption keys or API tokens) to the TEE over a secure, encrypted channel.
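The eight steps above can be condensed into a short sketch. Everything here is a simplification: a real CPU signs with a fused asymmetric key and the verifier walks a certificate chain back to Intel or AMD, so the HMAC with a shared secret below merely models “only the CPU and the vendor can produce/check this signature.” All function names are illustrative.

```python
import hashlib
import hmac
import secrets

# Stand-in for the key fused into the silicon; in reality it is asymmetric
# and the vendor publishes a certificate chain for the public half.
CPU_FUSED_KEY = secrets.token_bytes(32)

def cpu_generate_quote(loaded_code: bytes, nonce: bytes) -> dict:
    measurement = hashlib.sha256(loaded_code).hexdigest()   # step 2: measure
    report = f"{measurement}:{nonce.hex()}".encode()        # step 3: report
    signature = hmac.new(CPU_FUSED_KEY, report, hashlib.sha256).hexdigest()
    return {"measurement": measurement, "nonce": nonce.hex(), "sig": signature}

def vendor_verify(quote: dict) -> bool:                     # steps 5-6: verify
    report = f"{quote['measurement']}:{quote['nonce']}".encode()
    expected = hmac.new(CPU_FUSED_KEY, report, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, quote["sig"])

# Client side (steps 1, 7, 8).
enclave_code = b"def predict(x): ..."                       # the deployed code
known_good_hash = hashlib.sha256(enclave_code).hexdigest()  # expected fingerprint
nonce = secrets.token_bytes(16)                             # step 1: challenge
quote = cpu_generate_quote(enclave_code, nonce)
trusted = (vendor_verify(quote)
           and quote["nonce"] == nonce.hex()                # freshness
           and quote["measurement"] == known_good_hash)     # step 8: final check
print("trust established" if trusted else "abort")
```

Note the two independent checks the client performs: the signature proves the quote came from genuine hardware, and the measurement comparison proves the right software is inside. Either one alone is worthless.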

This process is absolutely fundamental. Without it, you’re just blindly trusting that the cloud provider has set things up correctly. With attestation, you get proof.

[Diagram: the attestation flow between (1) the client, (2) the AI service in the TEE (SGX enclave or SEV VM), and (3) the hardware vendor’s attestation service (Intel/AMD). The client challenges the TEE with a nonce; the TEE asks the CPU for a signed quote and returns it; the client asks the vendor whether the quote is legitimate; the vendor confirms the CPU is genuine and the signature valid; the client checks the code hash and, on a match, provisions its secrets.]

Putting It All Together: Real-World AI Scenarios

This all sounds great, but what does it actually enable? Let’s look at two concrete examples.

Scenario 1: Confidential AI Inference for Healthcare

  • The Problem: A hospital wants to use a cutting-edge AI model from a cloud startup to detect cancer in patient X-rays. The hospital’s data is extremely sensitive (HIPAA-protected), and they cannot legally or ethically send it to a third-party environment where it could be exposed. The startup’s AI model is their crown jewel IP and they will not deploy it on the hospital’s on-premise servers. It’s a classic standoff.
  • The Confidential Computing Solution:
    1. The startup deploys its inference service inside an AMD SEV-SNP VM or an Intel SGX enclave in a public cloud.
    2. The hospital’s on-premise application initiates a connection. It performs remote attestation, verifying that it’s talking to a genuine TEE and that the hash of the running code matches the known-good hash of the startup’s inference service.
    3. Once trust is established, the hospital’s app establishes a secure channel (e.g., TLS) directly with the code inside the TEE.
    4. It sends the encrypted patient X-ray. The data is decrypted only inside the TEE.
    5. The model runs inference on the plaintext data within the protected memory.
    6. The result (“cancer detected” or “negative”) is sent back over the secure channel.
  • The Outcome: The cloud provider never sees the patient data or the model. The hospital’s data never touches an untrusted environment. The startup’s model IP is never exposed. Both parties can collaborate without having to trust each other or the cloud provider.

Scenario 2: Secure Federated Learning for Finance

  • The Problem: Two competing banks want to train a superior fraud detection model. They know that by pooling their transaction data, they could achieve much higher accuracy. However, they are fierce rivals and are legally prohibited from sharing customer data with each other.
  • The Confidential Computing Solution:
    1. An agreed-upon “model aggregator” service is deployed inside an Intel SGX enclave. The code for this service is open-source and audited by both banks.
    2. Each bank trains the model on its own private data for one epoch. This produces a set of “model updates” or “gradients.”
    3. Each bank’s system connects to the aggregator enclave and performs remote attestation. They verify it’s the correct hardware and the correct, audited aggregator code.
    4. Each bank sends its encrypted model updates to the enclave.
    5. Inside the enclave, the updates are decrypted and averaged together to create a new, improved “global model.” The raw updates from each bank are immediately destroyed.
    6. The enclave sends the new global model back to both banks for the next round of local training.
  • The Outcome: A superior model is trained using data from both banks, but no raw transaction data ever leaves either bank’s perimeter. The model updates, which can sometimes be used to infer information about the training data, are only ever decrypted inside a secure, attested environment that no single party controls.
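The aggregation step at the heart of this scenario (step 5) is just federated averaging. A minimal sketch of what runs inside the enclave, with toy plaintext gradients; in the real design each update arrives encrypted and is decrypted only after the banks have attested the enclave:

```python
def aggregate_updates(updates):
    """Average per-parameter model updates from several parties (FedAvg core).

    `updates` is a list of equal-length gradient vectors, one per bank.
    The averaged vector becomes the next global model update.
    """
    n = len(updates)
    return [sum(vals) / n for vals in zip(*updates)]

bank_a = [2.0, -4.0, 1.0]   # toy gradients after one local epoch
bank_b = [4.0,  0.0, 3.0]
global_update = aggregate_updates([bank_a, bank_b])
print(global_update)  # [3.0, -2.0, 2.0]
```

Because only the average leaves the enclave, neither bank ever sees the other’s raw updates, which is exactly the property that attestation of the (audited, open-source) aggregator code is meant to guarantee.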

The “Gotchas”: This Isn’t a Magic Bullet

As a red teamer, my job is to be professionally paranoid. Confidential Computing is a massive leap forward, but it’s not without its own set of challenges and attack vectors. You need to be aware of them.

  1. Side-Channel Attacks: This is the boogeyman of TEEs. Even if an attacker can’t read the data in memory directly, they might be able to infer secrets by observing the side effects of the computation. Think of it as trying to figure out what’s being built inside a factory by just listening to the sounds, measuring power consumption, or timing how long deliveries take. Attacks like Spectre, Meltdown, and Foreshadow, as well as cache-timing and memory-access-pattern attacks, fall into this category. CPU vendors are constantly adding hardware mitigations, but it’s an ongoing arms race.
  2. Performance Overhead: Encrypting and decrypting every single memory access has a cost. While modern CPUs are incredibly fast at this, there is still a noticeable performance hit, especially for memory-intensive workloads. The overhead for SGX’s context switching (ECALLs/OCALLs) can also be significant if your application design is too “chatty.” You must benchmark your specific AI workload.
  3. Increased Complexity: This is not a “fire and forget” solution. Managing attestation, ensuring the TCB is clean, securely provisioning secrets, and debugging applications running inside a black box is complex. For SGX, the development and maintenance overhead of partitioning your app is substantial.
  4. The Trust in the Manufacturer: At the end of the day, you are placing your trust in the CPU hardware and the vendor (Intel or AMD). You trust that their CPU design is secure, their microcode is not backdoored, and their attestation services are run correctly. This is a much smaller and more auditable trust anchor than an entire cloud provider’s software stack, but it’s not zero.
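On point 2, “benchmark your specific AI workload” can be as simple as running the same timing harness twice: once on a plain VM and once inside the TEE (an SEV-SNP VM, or an SGX-backed runtime), then comparing medians. A minimal sketch; `memory_heavy_workload` is a placeholder for your actual inference or training step:

```python
import statistics
import time

def benchmark(fn, *, repeats: int = 5) -> float:
    """Run `fn` several times and return the median wall-clock seconds.

    Median rather than mean, so one noisy run doesn't skew the result.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def memory_heavy_workload(size: int = 200_000) -> int:
    # Stand-in for an inference pass: allocate and touch a large buffer,
    # the access pattern where memory encryption overhead shows up most.
    data = list(range(size))
    return sum(x * x for x in data)

median_s = benchmark(memory_heavy_workload)
print(f"median: {median_s:.4f}s")
```

The ratio between the two environments, not the absolute number, is what you report: memory-bound workloads typically pay more than compute-bound ones.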

So, Should You Bother?

Absolutely, yes.

For years, we’ve accepted the risk of processing unencrypted data in the cloud because there was no alternative. We relied on contracts, SLAs, and operational promises from cloud providers. Confidential Computing changes that.

It allows us to replace those promises with cryptographic proof. It provides a technical enforcement mechanism to create and verify trust in environments we don’t own or control.

It’s not perfect, and it’s not a replacement for good security hygiene. You still need to write secure code, manage your dependencies, and have a robust security posture. But for high-stakes AI workloads—where the data is regulated, the model is priceless, and the privacy of individuals is on the line—it provides a critical last line of defense.

The next time you’re designing a system that handles sensitive data, don’t just ask how you’ll protect it at rest and in transit. Ask the hard question: who can see my data, my code, my entire business logic, at the exact moment it’s running in memory?

If you don’t like the answer, it’s time to start building some walls. And for the first time, the CPUs themselves are handing you the bricks and mortar.