[EAAPL-SEC003] Model Isolation

Category: Security / Compute Isolation
Sub-category: Blast Radius Limitation
Version: 1.2
Maturity: Proven
Tags: isolation, sandboxing, network-segmentation, process-isolation, resource-quotas, egress-control, zero-trust
Regulatory Relevance: APRA CPS234 §21, EU AI Act Art. 9, ISO 27001 A.13.1, NIST AI RMF MANAGE 2.2

1. Executive Summary

Model Isolation defines the architectural pattern for constraining the execution environment of AI models — whether hosted internally or accessed via API — to limit the blast radius of a model compromise, data exfiltration attempt, or misconfigured model workload. It treats every model execution environment as a potential attack surface and applies defence-in-depth isolation controls at the network, process, storage, and identity layers.

For executives, the business case is straightforward: AI models represent a new class of compute workload with unique risk characteristics. Unlike a web server that executes deterministic code, an LLM or ML model can be manipulated by adversarial inputs to produce unexpected outputs, access data it should not access, or generate outputs that cause downstream harm. If a model execution environment is not properly isolated, a compromised model can become a pivot point for lateral movement across the enterprise network, access secrets or databases it should never reach, or exfiltrate data through its outputs.

This pattern is especially critical for organisations running on-premises model inference, fine-tuning workloads, or AI agents with tool access. It is equally relevant as a design requirement when evaluating model hosting vendors: the vendor's isolation architecture should be reviewed against this pattern before procurement decisions.

2. Problem Statement

Business Problem

Enterprises running AI model workloads face a novel risk: a model serving network requests is a stateful, long-running process that processes inputs from potentially adversarial sources. Unlike a stateless API endpoint, a model's behaviour can be influenced by its inputs in complex ways. If the model process runs with broad network access, file system access, or cloud IAM permissions, a successful adversarial input or model misconfiguration can lead to data exfiltration, lateral movement, or privilege escalation.

Technical Problem

AI model serving processes — whether Python (PyTorch, Transformers, vLLM), Go-based inference servers, or containerised model endpoints — typically run with more permissions than required:

Network access to all subnets (allowing lateral movement if compromised).
File system access to model weights, configuration, and sometimes application data.
Cloud IAM roles with broad permissions (inherited from the host's instance profile).
Outbound internet access (a model can exfiltrate data through HTTP calls in a tool-enabled agentic context).
No resource quotas (a single runaway inference job can starve other workloads).

Symptoms

Model serving processes running as root or with excessive IAM permissions.
No network segmentation between model servers and sensitive data stores.
Model weights stored on writable file systems (enabling weight poisoning).
No egress controls on model serving infrastructure.
Unrestricted resource usage by individual inference jobs.

Cost of Inaction

Dimension	Impact
Security	Compromised model server enables lateral movement to databases, secrets stores, or cloud control plane
Data	Adversarial inputs causing model to exfiltrate data from its context through tool calls or logging
Regulatory	APRA CPS234 requires controls commensurate with information security risks — ungoverned model execution is a cited gap
Operational	Runaway inference jobs cause resource exhaustion and service degradation for other workloads
Financial	Unrestricted egress from model servers can lead to data exfiltration costs and regulatory fines

3. Context

When to Apply

Any on-premises or cloud-hosted AI model inference workload.
AI agents with tool access (the consequences of a compromised agent are significantly higher than a passive model).
Fine-tuning workloads that process proprietary or sensitive training data.
Multi-tenant AI platforms where multiple teams or customers share model infrastructure.
RAG systems where the model has access to a document retrieval layer.

When NOT to Apply

External API-only model usage (Azure OpenAI, Anthropic Claude via API) — isolation is the provider's responsibility; however, this pattern informs the contractual and audit questions to ask the provider.
Single-developer local experimentation environments.

Prerequisites

Prerequisite	Detail
Container/VM infrastructure	Kubernetes, ECS, or VM-based deployment required for isolation controls
Network segmentation capability	VPC/VNet subnetting; security groups or network policies
Secrets management	Vault or cloud-native secrets manager for model credentials
IAM maturity	Ability to create fine-grained service accounts/roles for model workloads
Monitoring stack	Process-level and network-level monitoring for anomaly detection

Industry Applicability

Industry	Applicability	Key Driver
Financial Services	Critical	Data sovereignty; lateral movement risk to core banking systems
Healthcare	Critical	Patient data protection; PHI access controls
Government / Defence	Critical	Classified data segregation; adversarial threat model
Technology / SaaS	High	Multi-tenant isolation; intellectual property protection
Manufacturing / Industrial	High	OT/IT boundary protection; model access to operational data
Retail	Medium	PII protection; model access to customer data stores

4. Architecture Overview

Model isolation is implemented as a set of concentric isolation boundaries — each layer reduces the blast radius of a compromise at the layer above. The architecture philosophy is: assume the model is compromised; design the environment so that a compromised model cannot reach anything of value.

Network Isolation

Model serving workloads are deployed in a dedicated, isolated network segment (VPC subnet, Kubernetes namespace with NetworkPolicy, or dedicated VLAN). This segment has no direct connectivity to:

Core databases (customer data, financial records).
Secrets stores (vault, secrets manager).
Internal corporate networks.
Internet (unless explicitly permitted by egress policy).

Inbound traffic reaches the model only from the AI Gateway (EAAPL-SEC001) via a specific port. All other inbound traffic is denied. Outbound traffic is restricted to: the model registry (to fetch weights), the telemetry endpoint (to ship metrics/logs), and explicitly allowlisted tool endpoints (for agentic use cases). A DNS sinkhole or DNS firewall prevents the model from resolving arbitrary internet hostnames.
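As an illustration of the inbound and outbound rules described above, the following is a minimal sketch that builds a Kubernetes NetworkPolicy manifest as a Python dictionary and serialises it to JSON, which kubectl accepts alongside YAML. The policy name, namespace names, port, and CIDR values are hypothetical placeholders, and the sketch assumes allowlisted destinations can be expressed as IP blocks on TCP 443.

```python
import json


def build_model_network_policy(
    namespace: str,
    gateway_namespace: str,
    serving_port: int,
    egress_cidrs: list[str],
) -> dict:
    """Build a NetworkPolicy that only permits the listed paths.

    All names, labels, and ports here are illustrative assumptions.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "model-isolation", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Declaring both types means anything not listed below is denied.
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [
                {
                    # Only pods in the gateway namespace may connect.
                    "from": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": gateway_namespace
                                }
                            }
                        }
                    ],
                    "ports": [{"protocol": "TCP", "port": serving_port}],
                }
            ],
            "egress": [
                {
                    "to": [{"ipBlock": {"cidr": cidr}}],
                    "ports": [{"protocol": "TCP", "port": 443}],
                }
                for cidr in egress_cidrs
            ],
        },
    }


if __name__ == "__main__":
    policy = build_model_network_policy(
        "model-zone", "ai-gateway", 8000, ["10.20.0.10/32"]
    )
    print(json.dumps(policy, indent=2))
```

With default-deny egress in place, name resolution also stops working unless a rule permits traffic to the cluster's DNS resolver, so a real policy would likely need one more egress entry for that.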

Process Isolation

Model serving processes run with the minimum operating system permissions required:

Non-root user (UID 1000+).
Read-only root filesystem (model weights and configuration are mounted read-only).
No CAP_SYS_ADMIN or other privileged Linux capabilities.
Seccomp profile restricting available system calls to those required for inference.
AppArmor or SELinux policy enforcing the process's access to file system paths.
No access to the host network namespace (container network only).

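In Kubernetes, these settings are declared in the pod and container securityContext (runAsNonRoot: true, readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and a custom seccomp profile) and enforced cluster-wide by Pod Security Admission or a policy engine such as OPA Gatekeeper or Kyverno. PodSecurityPolicy, the older mechanism, was removed in Kubernetes 1.25.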
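The process isolation requirements above can be checked mechanically before a workload is admitted. The sketch below audits a pod spec (as a plain dictionary mirroring the Kubernetes PodSpec fields) and returns a list of findings. The function name is hypothetical, and requiring capabilities to drop ALL is a stricter assumption than the pattern's "no CAP_SYS_ADMIN" wording.

```python
def audit_pod_security(pod_spec: dict) -> list[str]:
    """Return human-readable findings; an empty list means no issues found."""
    findings = []
    if pod_spec.get("hostNetwork", False):
        findings.append("pod uses the host network namespace")

    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})

        # Container-level settings override pod-level ones where both exist.
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not non_root:
            findings.append(f"{name}: runAsNonRoot is not true")

        if not ctx.get("readOnlyRootFilesystem", False):
            findings.append(f"{name}: root filesystem is writable")

        # Kubernetes defaults to allowing escalation when the field is unset.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: privilege escalation is allowed")

        # Assumption: require dropping ALL capabilities, not only SYS_ADMIN.
        if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: capabilities are not dropped")

        seccomp = ctx.get("seccompProfile", pod_ctx.get("seccompProfile", {}))
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            findings.append(f"{name}: no seccomp profile configured")
    return findings
```

A check like this could run in CI against rendered manifests; it complements, rather than replaces, admission-time enforcement in the cluster.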

Resource Quotas

Runaway inference jobs cause denial of service. Resource quotas are enforced at:

Container level: CPU requests/limits, memory requests/limits, GPU memory limits.
Kubernetes namespace level: ResourceQuota objects limiting total CPU, memory, and GPU across all pods.
Per-request level: token limits enforced by the serving layer (vLLM max_tokens, TGI max_new_tokens).

Resource quotas protect not only the model infrastructure but also adjacent workloads sharing the cluster.
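To make the three enforcement levels concrete, here is a minimal sketch of two checks: whether a candidate pod still fits within a namespace quota, and how a serving layer might clamp a per-request token limit. The class and function names are hypothetical, and resources are modelled as plain integers (millicores, MiB, whole GPUs) rather than Kubernetes quantity strings.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PodLimits:
    cpu_millicores: int
    memory_mib: int
    gpus: int


@dataclass(frozen=True)
class NamespaceQuota:
    cpu_millicores: int
    memory_mib: int
    gpus: int


def fits_quota(
    quota: NamespaceQuota, running: Sequence[PodLimits], candidate: PodLimits
) -> bool:
    """True if admitting the candidate keeps namespace totals within quota."""
    pods = list(running) + [candidate]
    return (
        sum(p.cpu_millicores for p in pods) <= quota.cpu_millicores
        and sum(p.memory_mib for p in pods) <= quota.memory_mib
        and sum(p.gpus for p in pods) <= quota.gpus
    )


def clamp_max_tokens(requested: Optional[int], ceiling: int) -> int:
    """Apply a per-request token ceiling; a missing value falls back to it."""
    if requested is None:
        return ceiling
    if requested < 1:
        raise ValueError("max tokens must be a positive integer")
    return min(requested, ceiling)
```

In a real cluster the namespace check is performed by the Kubernetes ResourceQuota admission controller; the sketch only shows the arithmetic it applies.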

Secret Access Minimisation

The model serving process requires no secrets beyond what is needed to authenticate to the model registry and emit telemetry. It does not hold database credentials, user API keys, or service-to-service tokens. Any secrets required are:

Injected at startup via a sidecar (Vault Agent, AWS Secrets Manager CSI driver), with credentials that expire after use.
Never stored in environment variables (accessible to any process in the container).
Never stored in the model's context window.
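A minimal sketch of how an inference process might consume a sidecar-injected secret without ever consulting environment variables: it reads a file from the directory where the sidecar renders secrets. The function name and the idea of passing the mount directory as a parameter are assumptions for illustration; the actual mount path depends on the sidecar or CSI driver configuration.

```python
from pathlib import Path


def read_injected_secret(name: str, secrets_dir: Path) -> str:
    """Read a secret rendered by the sidecar into an in-memory volume.

    The environment is deliberately never consulted as a fallback.
    """
    # Reject empty names and anything that could escape the secrets directory.
    if not name or Path(name).name != name:
        raise ValueError(f"invalid secret name: {name!r}")

    value = (secrets_dir / name).read_text(encoding="utf-8").strip()
    if not value:
        raise ValueError(f"secret {name!r} is empty")
    return value
```

Callers should avoid logging the returned value or placing it in prompts, in line with the rule that secrets never enter the model's context window.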

Read-Only Model Weights

Model weight files are mounted read-only. This prevents weight poisoning attacks (where an attacker with write access to the model's file system can modify weights to alter model behaviour). Weights are loaded from a signed, immutable artefact store (container registry with image signing, S3 with Object Lock) and verified at startup using a cryptographic hash.
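The startup hash check could look roughly like the following sketch, which streams the weight file through SHA-256 and compares the result to an expected digest. The function names are hypothetical, and the sketch covers only the hash comparison; signature verification (for example with cosign) would be a separate step.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the expected value."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

An init container would typically exit non-zero when `verify_weights` returns False so that the pod never reaches the serving state. The expected digest should come from the signed artefact manifest, not from the same volume as the weights.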

Egress Controls for Agentic Systems

For AI agents with tool access, egress is the highest-risk attack surface. A compromised agent can use legitimate tool calls to exfiltrate data. Egress controls implement:

An explicit tool endpoint allowlist enforced at the network layer (not just application layer).
Rate limits on tool call frequency to limit exfiltration bandwidth.
Deep packet inspection on HTTP tool calls (payload inspection for data exfiltration patterns).
Tool call audit logging to detect anomalous patterns.
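The allowlist, rate limit, and audit logging controls above can be sketched at the application layer as follows. This is illustrative only: the class name and outcome labels are hypothetical, and as the list notes, the allowlist must also be enforced at the network layer, since application-level checks can be bypassed by a compromised process.

```python
import time
from collections import deque
from urllib.parse import urlsplit


class ToolEgressGuard:
    """Host allowlist plus a sliding-window rate limit, with an audit trail."""

    def __init__(self, allowed_hosts, max_calls, window_seconds, clock=time.monotonic):
        self._allowed = {h.lower() for h in allowed_hosts}
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._calls = deque()  # timestamps of recently permitted calls
        self.audit_log = []

    def check(self, url: str) -> bool:
        now = self._clock()
        host = (urlsplit(url).hostname or "").lower()

        # Discard timestamps that have fallen out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()

        if host not in self._allowed:
            outcome = "blocked_host"
        elif len(self._calls) >= self._max_calls:
            outcome = "rate_limited"
        else:
            self._calls.append(now)
            outcome = "allowed"

        # Every decision is recorded, including permitted calls.
        self.audit_log.append({"host": host, "outcome": outcome, "at": now})
        return outcome == "allowed"
```

Matching on exact hostnames rather than suffixes is a deliberate choice in this sketch: wildcard matching widens the allowlist in ways that are easy to get wrong.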

5. Architecture Diagram

flowchart TD
    subgraph External["External Zone"]
        A[AI Gateway]
    end
    subgraph ModelZone["Isolated Model Zone"]
        B[Network Policy]
        C[Inference Process]
        D[Secret Injector Sidecar]
        E[Resource Quota]
    end
    subgraph Egress["Allowlisted Egress"]
        F[Model Registry]
        G[Telemetry Endpoint]
    end
    subgraph Denied["Denied Zone"]
        H[Databases + Secrets]
    end
    A -->|mTLS inbound only| B --> C
    D -->|startup secrets only| C
    E -.->|enforces limits| C
    C --> F
    C --> G
    B -->|deny all other| H
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#fef9c3,stroke:#eab308
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#fef9c3,stroke:#eab308
    style H fill:#fee2e2,stroke:#ef4444

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Network Policy	Network Control	Restricts inbound/outbound traffic for model serving pods to allowlisted endpoints	Kubernetes NetworkPolicy, AWS Security Groups, Calico, Cilium	Critical
Pod Security Controls	Process Isolation	Enforces non-root execution, read-only filesystem, capability restrictions, seccomp profile	Kubernetes PodSecurity Admission, OPA Gatekeeper, Kyverno	Critical
Secret Sidecar	Secrets Management	Injects required secrets at startup; rotates and expires credentials; never persists secrets to disk	Vault Agent sidecar, AWS Secrets Manager CSI driver, Azure Key Vault CSI driver	Critical
Resource Quota	Resource Control	Limits CPU, memory, GPU consumption per pod and per namespace	Kubernetes ResourceQuota + LimitRange, Slurm (HPC), AWS Fargate resource limits	High
Model Registry	Artefact Store	Stores model weights in immutable, signed artefacts; enforces content-addressed retrieval	Docker Registry + Notary/cosign, S3 with Object Lock + SHA-256 manifest, MLflow Registry	High
Weight Integrity Verifier	Integrity Check	Verifies cryptographic hash of model weights at container startup before serving begins	Cosign, custom hash verification script in init container	High
Egress Controller	Network Control	Enforces outbound connection allowlist; optionally performs deep packet inspection on tool calls	Envoy egress proxy, Squid with allowlist, AWS VPC Endpoints, Cilium egress gateway	High
Log Sidecar	Observability	Collects process logs, system call traces, and network connection logs; forwards to SIEM	Fluentd, Fluent Bit, Datadog Agent, AWS FireLens	High
Seccomp Profile	OS Hardening	Restricts Linux system calls available to the inference process	Custom seccomp JSON profile, Docker default seccomp, Bottlerocket cgroups v2	Medium
AppArmor / SELinux Policy	OS Hardening	Mandatory access control enforcing file system and capability boundaries	AppArmor (Ubuntu/Debian), SELinux (RHEL/Amazon Linux)	Medium

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	DevOps / MLOps	Publishes model weights to model registry with cosign signature	Signed, immutable model artefact with SHA-256 digest
2	Container Orchestrator	Schedules model serving pod in isolated namespace; applies NetworkPolicy and PodSecurity constraints	Pod scheduled on dedicated node pool with isolation labels
3	Init Container	Fetches model weights from registry; verifies cosign signature and SHA-256 hash	Verified weights mounted at read-only path
4	Secret Sidecar	Authenticates to Vault using Kubernetes Service Account token; retrieves telemetry credentials; injects into shared memory	Short-lived credentials available to inference process
5	Inference Process	Starts serving; accepts inbound requests only from AI Gateway over mTLS	Model ready to serve
6	AI Gateway	Forwards validated, sanitised request to model	Request received by inference process
7	Inference Process	Runs inference; generates response; for agentic workloads, makes tool calls only to allowlisted endpoints	Response or tool call output
8	Log Sidecar	Collects process logs, resource metrics, and network connection events; forwards to telemetry endpoint	Observability data available in SIEM/monitoring stack
9	Resource Quota Controller	Enforces CPU/memory/GPU limits; throttles or terminates if limits exceeded	Normal operation or throttle/OOMKill event

Error Flow

Error Condition	Behaviour	Alert
Weight integrity check fails	Pod fails to start; alert MLOps team	Critical: model weight integrity violation
Secret sidecar cannot authenticate to Vault	Pod fails to start; no credentials available	Critical: secret injection failure
Network policy violation attempt	Connection rejected by Kubernetes NetworkPolicy; logged by Cilium/Calico	Security: model attempting disallowed egress
Resource quota exceeded	Pod throttled (CPU) or OOMKilled (memory); pod restarted	Warning: resource exhaustion
Seccomp violation (blocked syscall)	Process terminated with SIGSYS; pod restarted	Security: unexpected syscall from model process

8. Security Considerations

Authentication & Authorisation

Model serving process has no inbound authentication to manage (auth handled by AI Gateway before request reaches model).
Outbound authentication for tool calls uses short-lived tokens injected by the secret sidecar — never long-lived credentials embedded in configuration.
Kubernetes Service Account tokens used for Vault authentication are bound to the specific pod's namespace and expire within 1 hour.

Secrets Management

No secrets in environment variables (visible in container inspect, logs, crash dumps).
No secrets in model weights or configuration files.
Secret sidecar injects credentials into in-memory tmpfs only.
All credential access logged by Vault for audit.

Data Classification

Model execution environment is classified at the sensitivity level of the highest-classification data it will process. A model serving requests containing CONFIDENTIAL data must be isolated in a CONFIDENTIAL-tier network segment.
Cross-classification boundary serving is prohibited — a model serving CONFIDENTIAL requests must not also serve PUBLIC requests (context window contamination risk).
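One way to read the two classification rules above is sketched below: the environment tier is the highest classification it will process, and a request is served only when its classification matches that tier exactly. The INTERNAL level and both function names are hypothetical; the pattern itself names only PUBLIC and CONFIDENTIAL.

```python
from enum import IntEnum
from typing import Iterable


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1  # hypothetical intermediate level, not named in the pattern
    CONFIDENTIAL = 2


def required_tier(data_classifications: Iterable[Classification]) -> Classification:
    """The environment takes the highest classification it will process."""
    return max(data_classifications)


def may_serve(environment_tier: Classification, request: Classification) -> bool:
    """Exact match only: no serving across classification boundaries."""
    return environment_tier == request
```

Under this reading, an organisation that handles both PUBLIC and CONFIDENTIAL requests would run two separate model environments rather than one shared environment at the higher tier.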

Encryption

Model weights encrypted at rest in registry (AES-256, provider-managed key) and in transit (TLS 1.3 from registry to pod).
Network traffic between pods is encrypted using service-mesh mTLS (Istio/Linkerd) or WireGuard (Cilium).
Scratch space (for intermediate computation) uses encrypted ephemeral volumes.

Auditability

All egress connection attempts (successful and blocked) logged with source pod, destination IP/hostname, and timestamp.
All secret access events logged by Vault.
All resource quota violations logged for security review (may indicate attempted resource exhaustion attack).

OWASP LLM Top 10 Coverage

OWASP LLM Risk	Model Isolation Mitigation	Coverage
LLM01: Prompt Injection	Isolation limits blast radius if injection succeeds; does not prevent injection itself	Low
LLM02: Insecure Output Handling	Egress controls limit exfiltration of data through tool calls in agentic contexts	High
LLM03: Training Data Poisoning	Read-only model weights + weight integrity verification prevent weight-level poisoning post-deployment	High
LLM04: Model Denial of Service	Resource quotas prevent runaway inference from affecting other workloads	High
LLM05: Supply Chain Vulnerabilities	Signed model artefacts and integrity verification at startup prevent supply chain compromise of model weights	High
LLM06: Sensitive Information Disclosure	Network isolation prevents direct access to data stores; context window data cannot reach external endpoints	High
LLM07: Insecure Plugin Design	Egress allowlist enforces tool endpoint restrictions at network layer	High
LLM08: Excessive Agency	Egress controls and tool allowlist limit the actions an agent can take	High
LLM09: Overreliance	Not applicable	None
LLM10: Model Theft	Read-only filesystem; encrypted weights at rest; no external weight exfiltration path	High

9. Governance Considerations

Responsible AI

Model isolation ensures that AI model behaviour is bounded — a model cannot access data beyond its authorised scope, which is a prerequisite for responsible deployment.
Isolation boundaries must be documented in the AI system's risk register and reviewed as part of the AI impact assessment process.

Model Risk Management

Isolation controls form a critical part of the model risk management framework: they limit the operational risk from a model behaving unexpectedly.
Weight integrity verification is a model risk control — it ensures the deployed model is the validated, approved model.

Human Approval

Changes to network policy (e.g., adding a new egress allowlist entry) require approval from Security Architecture and are subject to change management.
Changes to seccomp profiles or AppArmor policies require security team review.

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Model Isolation Design Document	Security Architecture	With each new model deployment	Documents isolation controls for each model environment
Network Policy Audit Report	Security Operations	Quarterly	Verifies network policies are correctly applied and not bypassed
Weight Integrity Verification Log	MLOps	Continuous	Evidence that deployed models match approved artefacts
Egress Connection Log	Security Operations	Continuous review	Detects anomalous outbound connections from model serving
Resource Quota Review	Platform Engineering	Quarterly	Ensures quotas are appropriate for workload without over-provisioning risk

10. Operational Considerations

Monitoring

Process-level: CPU, memory, GPU utilisation per inference process; seccomp violation events.
Network-level: egress connection attempts (blocked and permitted); inbound connection sources.
Storage-level: write attempts to read-only filesystem (AppArmor/seccomp violation).
Resource-level: quota utilisation trends; OOMKill events.

SLOs

SLO	Target	Measurement
Weight integrity verification time	<30s at pod startup	Init container span
Secret injection latency	<5s at pod startup	Secret sidecar span
Network policy enforcement latency	<1ms per connection	Cilium/Calico metrics
Egress block alert latency	<60s from connection attempt to alert	Alert pipeline latency
Seccomp/AppArmor violation alert	<30s from violation to SIEM	SIEM ingestion latency

Logging

Structured JSON from all sidecars. Mandatory: pod_name, namespace, event_type (egress_attempt, seccomp_violation, oomkill, weight_integrity_check), outcome (allowed/blocked/failed), timestamp_utc.
Network connection logs include src_pod, dst_ip, dst_hostname, dst_port, protocol, bytes_transferred, outcome.
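A minimal sketch of an event builder that enforces the mandatory fields and allowed values listed above. The function name is hypothetical; the field names and enumerated values are taken from the logging requirements, and additional fields such as `dst_hostname` are passed through as keyword arguments.

```python
import json
from datetime import datetime, timezone

EVENT_TYPES = {
    "egress_attempt",
    "seccomp_violation",
    "oomkill",
    "weight_integrity_check",
}
OUTCOMES = {"allowed", "blocked", "failed"}


def build_event(pod_name: str, namespace: str, event_type: str, outcome: str, **extra) -> str:
    """Return one structured JSON log line with the mandatory fields set."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type}")
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")

    record = {
        "pod_name": pod_name,
        "namespace": namespace,
        "event_type": event_type,
        "outcome": outcome,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }
    # Extra fields may extend the record but never overwrite mandatory ones.
    clashes = set(extra) & set(record)
    if clashes:
        raise ValueError(f"extra fields clash with mandatory fields: {sorted(clashes)}")
    record.update(extra)
    return json.dumps(record, sort_keys=True)
```

For example, a blocked egress attempt could be logged with `build_event("vllm-0", "model-zone", "egress_attempt", "blocked", dst_hostname="example.invalid", dst_port=443)`, where the pod and namespace names are placeholders.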

Incident Management

Egress connection attempt to non-allowlisted destination → P1 security incident; immediate pod isolation; security operations investigation.
Seccomp violation → P2; pod quarantined; security review of syscall.
Weight integrity failure → P1; pod does not start; MLOps escalation; artefact store integrity investigation.
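The three incident rules above can be expressed as a small routing table. The sketch below maps a log event (using the `event_type` and `outcome` fields from the logging section) to a priority and response; the `Incident` type and `triage` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Incident(NamedTuple):
    priority: str
    response: str


# (event_type, required outcome or None for any outcome, incident)
_RULES = [
    ("egress_attempt", "blocked",
     Incident("P1", "isolate pod; security operations investigation")),
    ("seccomp_violation", None,
     Incident("P2", "quarantine pod; security review of syscall")),
    ("weight_integrity_check", "failed",
     Incident("P1", "pod does not start; MLOps escalation; artefact store investigation")),
]


def triage(event: dict) -> Optional[Incident]:
    """Return the incident to raise for an event, or None if none applies."""
    for event_type, outcome, incident in _RULES:
        if event.get("event_type") != event_type:
            continue
        if outcome is None or event.get("outcome") == outcome:
            return incident
    return None
```

Keeping the rules as data rather than branching logic makes it easier to review changes to incident priorities under change management.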

DR

Scenario	RTO	Recovery
Pod OOMKilled	30s	Kubernetes restarts pod; alert to platform team
Model registry unavailable	5min (new pods cannot start; existing pods continue)	Cached weights in running pods; restore registry
Vault unavailable	2min (pods can't start or rotate secrets)	Vault HA cluster; emergency credential cache in CSI driver
Network policy misconfiguration	5min	Rollback network policy to last known-good version via GitOps

11. Cost Considerations

Cost Drivers

Cost Driver	Description	Relative Impact
Dedicated node pool	Model workloads often require GPU nodes; isolation to dedicated pools prevents bin packing with other workloads	High
Egress proxy	Envoy or Squid egress proxy adds compute cost	Low
Secret sidecar	Vault Agent or CSI driver adds memory overhead per pod	Low
Security scanning	Image scanning, seccomp profile generation, AppArmor policy authoring engineering time	Medium
GPU underutilisation	Isolation prevents sharing GPU nodes with non-model workloads	Medium–High

Optimisations

Use node affinity and taints to co-locate multiple isolated model workloads on the same GPU node while maintaining pod-level isolation — share the node's GPU, not the network or filesystem.
Use GPU partitioning (MIG on NVIDIA A100) to allow multiple isolated pods to share a single GPU with hardware-level memory isolation. Plain GPU time-slicing shares GPU memory between processes and does not provide that isolation.

Indicative Cost Range

Scale	Monthly AWS Additional Cost (USD)	Notes
Small (1–2 model endpoints)	$200–$600	Dedicated EKS node group, NAT Gateway for egress control
Medium (5–20 model endpoints)	$1,000–$4,000	Dedicated node pools; Cilium enterprise for egress; additional monitoring
Large (50+ model endpoints)	$8,000–$25,000	Multi-tenant GPU cluster with fine-grained isolation; dedicated security tooling

12. Trade-Off Analysis

Option Comparison

Option	Description	Pros	Cons	Best For
A: Namespace-only isolation	Separate Kubernetes namespace with NetworkPolicy; no process-level hardening	Low operational overhead; fast to implement	Process escapes still possible; shared kernel; no egress DPI	Dev/staging environments; low-sensitivity workloads
B: Full pod hardening (this pattern)	Namespace + process isolation (seccomp, AppArmor, non-root, read-only FS) + egress control	Comprehensive isolation; industry-standard	Requires seccomp profile authoring; AppArmor policy management; operational overhead	Production AI workloads; regulated environments
C: VM-level isolation	Each model in a dedicated VM (or Kata Containers for VM-level isolation in Kubernetes)	Kernel isolation; strongest blast radius containment	High cost; poor bin packing; slow start time	Highest-risk workloads; multi-tenant with hostile tenants
D: Managed service isolation	Use cloud-managed model serving (SageMaker, Azure ML, Vertex AI) and accept provider isolation	Low operational burden; provider SLAs	Vendor lock-in; less control; data residency constraints; can't customise seccomp	Organisations without Kubernetes expertise

Architectural Tensions

Tension	Trade-Off
Isolation vs Operability	Strict seccomp profiles and read-only filesystems can break inference libraries that write temp files. Resolution: profile the inference process's system call requirements before writing the seccomp profile; use tmpfs for scratch space.
Performance vs Security	Network policy enforcement (Cilium eBPF) and seccomp add per-request overhead. At high inference volumes, this can be measurable. Resolution: eBPF-based enforcement (Cilium) is near-zero-overhead; seccomp adds <1% CPU overhead for inference workloads.
GPU Sharing vs Isolation	GPU memory isolation requires MIG, which is available only on MIG-capable GPUs such as the A100 and H100; other GPUs share GPU memory between processes. Resolution: use MIG for production; accept soft isolation (process-level) for other GPU types.
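For the Isolation vs Operability resolution, a profile can be generated from the syscalls observed while exercising the inference process in a test environment. The sketch below produces a profile in the JSON layout used by Docker and the OCI runtime specification. The function name and the source of the observed syscall list (for example a trace captured during load testing) are assumptions.

```python
import json


def build_seccomp_profile(
    observed_syscalls,
    default_action: str = "SCMP_ACT_ERRNO",
    architectures=("SCMP_ARCH_X86_64",),
) -> dict:
    """Build an allowlist profile: observed syscalls pass, all others do not.

    SCMP_ACT_ERRNO makes blocked calls fail with an error; choosing
    SCMP_ACT_KILL_PROCESS instead terminates the process with SIGSYS.
    """
    names = sorted(set(observed_syscalls))
    if not names:
        raise ValueError("refusing to build a profile with no allowed syscalls")
    return {
        "defaultAction": default_action,
        "architectures": list(architectures),
        "syscalls": [{"names": names, "action": "SCMP_ACT_ALLOW"}],
    }


if __name__ == "__main__":
    # Hypothetical, heavily truncated syscall list for illustration only.
    profile = build_seccomp_profile(["read", "write", "mmap", "futex", "read"])
    print(json.dumps(profile, indent=2))
```

A profile built this way is only as complete as the trace behind it, so rarely exercised code paths (model reloads, error handling) should be covered before the profile is enforced in production.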

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
Seccomp profile too restrictive (breaks inference library)	Medium	High (model unavailable)	Pod CrashLoopBackOff; SIGSYS in logs	Audit required syscalls; update seccomp profile; redeploy
Network policy rule error (legitimate traffic blocked)	Medium	High (model unreachable from gateway)	503 errors from gateway → model; network connectivity check	Roll back network policy; investigate and fix
Weight integrity check false negative	Very Low	Critical	Post-deployment model behaviour anomaly detection	Forensic analysis of model registry; rolling restart from clean artefact
Secret sidecar certificate rotation failure	Low	High (credentials expire; model cannot authenticate for tool calls)	Secret expiry metric approaching zero	Sidecar restart; Vault token renewal
GPU memory isolation breach (non-MIG GPU)	Low	Medium (process memory accessible between pods)	Process-level memory boundary monitoring	Migrate to MIG-capable hardware; temporary: single-tenant GPU nodes

14. Regulatory Considerations

Regulation	Requirement	Model Isolation Implementation
APRA CPS234 §21	Information security controls commensurate with sensitivity	Network and process isolation directly address information asset protection
APRA CPS234 §23	Capability to detect and respond to information security incidents	Egress logging and violation alerting implement incident detection for model environments
EU AI Act Art. 9 (Risk Management)	Implement technical and organisational measures to manage AI risks	Model isolation is a core technical risk management measure for on-premises AI workloads
ISO 27001 A.13.1 (Network Security)	Manage and control networks to protect information systems	Network policy and egress control implement this requirement for AI workloads
ISO 27001 A.12.6 (Technical Vulnerability Management)	Prevent exploitation of technical vulnerabilities	Read-only filesystem and weight integrity verification address model-layer vulnerability management
NIST AI RMF MANAGE 2.2	Mechanisms exist to prevent improper access	Isolation controls implement access prevention at network, process, and storage layers

15. Reference Implementations

AWS

Component	AWS Service
Container isolation	EKS with Bottlerocket OS (seccomp by default); OPA Gatekeeper for policy
Network isolation	VPC subnets + Security Groups; EKS NetworkPolicy via Cilium or Calico
Egress control	AWS Network Firewall; VPC Endpoints for AWS services (no internet path)
Process isolation	Bottlerocket OS seccomp profiles; AWS Fargate (VM-level isolation)
Secret injection	AWS Secrets Manager CSI driver; IAM Roles for Service Accounts (IRSA)
Weight storage	ECR (OCI artefacts) with image signing (cosign); S3 with Object Lock
Resource quotas	EKS ResourceQuota + LimitRange; NVIDIA GPU Operator for GPU quotas

Azure

Component	Azure Service
Container isolation	AKS with Azure Linux (CBL Mariner); Azure Policy for pod security
Network isolation	AKS NetworkPolicy (Azure CNI or Calico); private AKS cluster
Egress control	Azure Firewall with FQDN allow rules
Secret injection	Azure Key Vault CSI driver; Workload Identity
Weight storage	Azure Container Registry with Notation signing; Azure Blob with immutability
Resource quotas	AKS ResourceQuota; Node Taints for GPU isolation

GCP

Component	GCP Service
Container isolation	GKE Autopilot (enforces security best practices by default); Workload Identity
Network isolation	GKE NetworkPolicy; Private GKE cluster; VPC Service Controls
Egress control	Cloud NGFW firewall policy rules with FQDN objects; Secure Web Proxy
Secret injection	Secret Manager CSI driver; Workload Identity Federation
Weight storage	Artifact Registry with Binary Authorization

On-Premises

Component	Technology
Container isolation	Kubernetes with OPA Gatekeeper; custom seccomp profiles per model workload
Network isolation	Calico or Cilium NetworkPolicy; dedicated VLAN per model tier
Egress control	Envoy egress proxy with explicit upstream allowlist
Secret injection	HashiCorp Vault Agent sidecar injector
Weight storage	Harbor registry with Notary signing; Ceph S3 with WORM policies
GPU isolation	NVIDIA MIG on A100; one MIG instance per isolated model workload

16. Related Patterns

Pattern	ID	Relationship
AI Gateway	EAAPL-SEC001	Gateway is the only permitted inbound path to the model; isolation enforces this at network layer
Secure Tool Invocation	EAAPL-SEC004	Egress controls in model isolation are the network-layer enforcement of tool invocation policy
Zero-Trust AI Pipeline	EAAPL-SEC007	Model isolation implements the compute-layer zero-trust controls within the broader pipeline
Secrets Management for AI	EAAPL-SEC008	Secret injection sidecar pattern depends on SEC008 for the vault infrastructure
AI Telemetry	EAAPL-OBS001	Log sidecar pattern provides the telemetry pipeline for model execution events
Adversarial Input Defence	EAAPL-SEC010	Isolation limits blast radius of adversarial inputs that succeed in manipulating model behaviour

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Pattern definition clarity	4	Well-defined; some GPU-specific isolation guidance still evolving
Technology availability	4	Kubernetes + Cilium + OPA provides complete implementation; GPU MIG requires specific hardware
Industry adoption	3	Applied in security-mature organisations; underimplemented in most enterprises deploying AI
Operational tooling	4	Strong Kubernetes security tooling ecosystem
Regulatory alignment	4	Directly addresses CPS234, EU AI Act Art. 9 requirements
Community knowledge	4	Kubernetes security community well-documented; AI-specific extensions are newer

18. Revision History

Version	Date	Author	Changes
1.0	2024-03-01	Security Architecture Team	Initial pattern definition
1.1	2024-07-15	Security Architecture Team	Added GPU MIG isolation guidance; updated OWASP LLM mapping
1.2	2025-02-01	Security Architecture Team	Added weight integrity verification; updated regulatory mapping for EU AI Act
