2.1.4 Edge vs. Cloud-Based AI Systems

2025.10.06.
AI Security Blog

A model’s deployment location is not a trivial detail; it is a fundamental architectural choice that dictates its entire attack surface. Before you can probe for vulnerabilities, you must answer the primary question: where does the inference happen? Is it on a remote, powerful server, or directly on the user’s device? The answer radically changes your red teaming strategy, tooling, and potential attack vectors.

After a model is trained and validated, as discussed in the model lifecycle, it must be deployed to be useful. This deployment broadly falls into two categories: cloud-based, where processing occurs on centralized servers, and edge-based, where processing occurs locally on the end-user device. Understanding the security trade-offs of each is critical for mapping vulnerabilities.

Cloud-Based AI: The Centralized Fortress

In a cloud-based architecture, the end-user device (like a mobile phone or web browser) acts as a thin client. It collects raw data (e.g., an image, a voice command, a text string) and sends it over a network to a powerful server managed by the service provider. The server hosts the AI model, performs the inference, and sends the result back to the client.

From a red teamer’s perspective, this architecture presents a classic network service target. The model itself is protected within the provider’s infrastructure, making direct access difficult. Your focus shifts to the perimeter and the communication channels.

Attack Surface of Cloud AI

  • API Endpoints: This is the primary gateway. Look for common web vulnerabilities: broken authentication, injection attacks (especially if the API pre-processes input insecurely), insecure direct object references, and rate-limiting flaws that could enable denial-of-service or costly resource-consumption attacks (a probe for this is sketched after the code example below).
  • Network Traffic: Data is in transit. Is it encrypted with up-to-date protocols? Can you perform a Man-in-the-Middle (MITM) attack to intercept, view, or manipulate the data being sent to the model or the results coming back?
  • Cloud Infrastructure Misconfiguration: The model relies on a vast ecosystem of cloud services. Misconfigured S3 buckets, overly permissive IAM roles, exposed database credentials, or unsecured container orchestration can provide a backdoor to the model or the data it processes.
  • Data Aggregation Points: Cloud systems centralize user data for processing and potential retraining. This makes the cloud storage a high-value target for data exfiltration or large-scale data poisoning attacks.
# Pseudocode: Interacting with a cloud-based model via API
import base64
import requests

API_URL = "https://api.example-ai.com/v1/image-classifier"
API_KEY = "your_secret_api_key_here"  # A primary target for theft
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def classify_image_from_cloud(image_path):
    # Raw data is sent over the network; the model itself never leaves the server
    with open(image_path, "rb") as f:
        payload = {"image": base64.b64encode(f.read()).decode("ascii")}

    # requests sets Content-Type: application/json when the json= kwarg is used
    response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=30)

    if response.status_code == 200:
        return response.json()  # Result is received from the server
    return {"error": response.text}

Edge AI: The Distributed Outpost

In an edge computing architecture, the AI model runs directly on the end-user device—a smartphone, a smart camera, an industrial sensor, or a vehicle. There is no mandatory network call to a central server for inference. This approach is favored for applications requiring low latency (e.g., autonomous driving), offline functionality, or enhanced data privacy (e.g., processing health data on-device).

For a red teamer, the game changes completely. The network perimeter is less relevant for the core inference process. The target is now the device itself. You have moved from attacking a fortress to infiltrating thousands of distributed outposts.

Attack Surface of Edge AI

  • Physical Access: If you can get your hands on the device, you may be able to extract the model. This includes hardware tampering, JTAG debugging, or side-channel attacks (e.g., power analysis) to reverse-engineer the model’s architecture and weights.
  • Local Storage: The model file (e.g., a .tflite or .onnx file) must be stored on the device. Is it encrypted? Are file permissions properly set? An attacker with root access to the device can likely steal the model directly (a quick permission audit is sketched after the code example below).
  • Software Vulnerabilities: Exploiting a vulnerability in the device’s operating system or the application hosting the model can grant you the privileges needed to access and manipulate the model files or its runtime memory.
  • Sensor Manipulation: Since the model is processing data directly from on-device sensors (camera, microphone), you can focus on crafting physical adversarial examples. For a smart camera, this could be a printed patch; for a voice assistant, it could be an inaudible audio command.
# Pseudocode: Interacting with an edge-based model
import tflite_runtime.interpreter as tflite

# The model file is a local asset, a primary target for extraction
MODEL_PATH = "/local/storage/models/classifier.tflite"

def classify_image_on_edge(image_data):
    # Load the model directly from the device's filesystem
    interpreter = tflite.Interpreter(model_path=MODEL_PATH)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # image_data must already match the model's expected input shape and dtype
    # (e.g., a numpy array shaped like input_details[0]['shape'])
    interpreter.set_tensor(input_details[0]['index'], image_data)
    interpreter.invoke()  # Inference happens entirely on the device

    return interpreter.get_tensor(output_details[0]['index'])

[Figure: Comparison of cloud vs. edge AI attack surfaces. Cloud-based AI: device → network (MITM, API abuse) → cloud server (infrastructure exploit). Edge AI: device holding the model file (physical access, model extraction, software exploit).]

Red Team Strategy: A Comparative Summary

Your initial reconnaissance must determine the system’s architecture. Is the app you’re testing making network calls to a known AI service provider’s domain? Or does it bundle a large model file within its own package? The answer to this question defines your entire engagement plan.
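
One quick way to answer that question is to look for bundled model files. Here is a minimal triage sketch, assuming the application package has already been unpacked to a local directory; the extension list is an illustrative assumption, not exhaustive.

# Pseudocode: Triage: does the app bundle an on-device model?
from pathlib import Path

MODEL_EXTENSIONS = {".tflite", ".onnx", ".pt", ".mlmodel"}

def find_bundled_models(unpacked_app_dir):
    hits = []
    for p in Path(unpacked_app_dir).rglob("*"):
        # Large files with model extensions suggest edge inference
        if p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS:
            hits.append((str(p), p.stat().st_size))
    return hits  # An empty result points toward a cloud-backed architecture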

| Security Dimension | Cloud-Based AI | Edge AI |
|---|---|---|
| Primary Attack Vector | Network and API-based attacks. Focus on web application security, cloud configuration, and sniffing traffic. | Device-centric attacks. Focus on physical access, local software exploitation, and reverse engineering. |
| Model Extraction Risk | Low. The model is a black box, protected by the provider's infrastructure. Extraction requires a significant breach of the cloud environment. | High. If an attacker gains privileged access to the device, they can often directly copy the model file. |
| Data Privacy Concerns | High. Sensitive user data is transmitted and stored centrally, creating a single point of failure and a valuable target for attackers. | Lower. Data is processed locally, reducing exposure. Privacy is compromised only if the specific device is compromised. |
| Scalability of Attack | High. A single vulnerability in the central API or cloud infrastructure can compromise all users of the service simultaneously. | Low. An attack typically compromises a single device. Scaling the attack requires compromising many individual devices. |
| Detection & Monitoring | Centralized. The provider can implement robust logging, intrusion detection, and anomaly detection across all API traffic. | Decentralized. Monitoring is difficult and relies on on-device security agents, which may not exist or may be disabled by the attacker. |

Finally, be aware of hybrid systems. Many applications use a combination of both architectures. For example, a simple “wake word” detection might run on the edge, which then triggers a more complex query to be sent to the cloud. These hybrid models present a combined attack surface, allowing you to chain vulnerabilities. You might exploit the edge component to gain information that helps you attack the cloud backend, or vice versa. As a red teamer, your job is to understand this flow of data and inference to identify the weakest link in the chain.
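
To make the hybrid flow concrete, here is a minimal sketch of a wake-word pipeline; the endpoint, header, and function names are hypothetical stand-ins, not a real API.

# Pseudocode: Hybrid pipeline: edge gate, then cloud query (names hypothetical)
import requests

CLOUD_NLU_URL = "https://api.example-ai.com/v1/nlu"
AUTH_HEADERS = {"Authorization": "Bearer your_secret_api_key_here"}

def wake_word_detected(frame: bytes) -> bool:
    # Stand-in for the on-device model (e.g., a small TFLite classifier);
    # its attack surface: model extraction, adversarial or inaudible audio
    return False  # placeholder

def handle_audio_frame(frame: bytes):
    # Stage 1 (edge): lightweight gate; no network traffic until it fires
    if wake_word_detected(frame):
        # Stage 2 (cloud): the full query leaves the device.
        # Attack surface: MITM, API abuse, cloud misconfiguration
        response = requests.post(CLOUD_NLU_URL, headers=AUTH_HEADERS,
                                 json={"audio": frame.hex()}, timeout=30)
        return response.json()
    return None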