[EAAPL-SEC009] AI Data Classification

Category: Security / Data Governance
Sub-category: Classification Enforcement
Version: 1.1
Maturity: Proven
Tags: data-classification sensitivity-labels rag-security cross-boundary-controls data-governance output-labelling
Regulatory Relevance: Australian Privacy Act APP 11, GDPR Art. 25, EU AI Act Art. 10, APRA CPS234, ISO 27001 A.8.2, NIST AI RMF MAP 1.5

1. Executive Summary

AI Data Classification defines the automated detection, labelling, and enforcement architecture that ensures data entering and exiting AI systems carries appropriate sensitivity labels, and that those labels are enforced at every processing stage. Compared with traditional classification for document management, AI systems introduce distinct challenges: data from multiple classification levels may be mixed in a single prompt (e.g., a RAG system injecting both public and confidential documents), model outputs may inherit the classification of their inputs, and classification must be enforced at machine speed, with sub-100ms latency.

For business leaders, the risk without classification enforcement is concrete: a RAG system that retrieves documents across sensitivity boundaries can expose CONFIDENTIAL information to users with only PUBLIC clearance. An AI model processing RESTRICTED data may route that data to an external model provider in violation of data sovereignty requirements. Output labelling tells downstream systems how to handle AI-generated content appropriately.

This pattern covers: automated classification of data entering AI systems, classification enforcement in RAG retrieval pipelines, cross-boundary controls that prevent data from flowing to inappropriate model providers, output labelling of AI-generated content, and the integration of classification decisions with the AI Gateway's routing and policy engine. It is the data governance foundation that makes all other AI security controls classification-aware.

2. Problem Statement

Business Problem

Organisations have invested significant effort in classifying their data — documents labelled CONFIDENTIAL, RESTRICTED, or PUBLIC; databases with column-level sensitivity tags; data catalogs with lineage and classification metadata. However, when AI systems enter the picture, these classifications are often ignored: documents of all classification levels are embedded into a vector database without sensitivity labels, RAG systems retrieve and inject them into prompts without checking the requester's clearance level, and model outputs inherit no classification from the sensitive documents they summarise.

Technical Problem

AI-specific classification challenges:

Prompt classification: A prompt assembled from a PUBLIC system message, a CONFIDENTIAL user query, and a RESTRICTED document chunk should be classified at the highest level of any component (RESTRICTED). Without automated classification, this labelling does not occur.
RAG cross-boundary retrieval: Semantic search by its nature retrieves documents based on relevance, not classification. A query from a PUBLIC-clearance user can retrieve RESTRICTED documents if the retrieval layer does not enforce classification boundaries.
Output classification inheritance: A model output that summarises a CONFIDENTIAL document is at least CONFIDENTIAL. Without output classification labelling, this content may be logged, shared, or processed without appropriate protections.
Model routing: Data at certain classification levels must not be sent to external cloud model providers (data sovereignty or contractual restriction). Without classification on the prompt, the AI Gateway cannot enforce routing decisions.

Symptoms

RAG systems retrieving documents without classification filtering on the query.
Prompts containing CONFIDENTIAL data being routed to external model APIs.
Model outputs with no sensitivity label, processed as PUBLIC by downstream systems.
No mechanism to enforce "RESTRICTED data → on-premises model only" routing policy.
Multiple classification levels mixed in a single vector index with no per-document labelling.

Cost of Inaction

Dimension	Impact
Regulatory	CONFIDENTIAL or RESTRICTED data reaching external model providers — potential data sovereignty violation
Privacy	CONFIDENTIAL personal data retrieved and exposed to unauthorised users via RAG
Security	Sensitive data appearing in logs or downstream systems without appropriate protective controls
Legal	Contractual violations from data handling requirements not being met
Operational	No basis for auditing what classification levels of data were processed by AI systems

3. Context

When to Apply

Any AI system with a RAG component that retrieves from a mixed-classification document corpus.
Organisations with data classification policies that must be enforced in AI pipelines.
AI systems whose prompts include data from external sources (uploaded documents, database records, email content).
Regulated data environments where data sovereignty (geography/provider) restrictions apply.
AI systems that generate content that feeds downstream systems with classification-sensitive handling requirements.

When NOT to Apply

AI systems operating exclusively on a single classification level of data (all PUBLIC, or all CONFIDENTIAL in an environment where all processing is CONFIDENTIAL-cleared).
Proof-of-concept systems not processing real enterprise data.

Prerequisites

Prerequisite	Detail
Data Classification Schema	Organisation's classification taxonomy (typically 3–5 levels)
Data Catalogue / Labels	Existing classification metadata on documents and data sources
AI Gateway (EAAPL-SEC001)	Classification labels consumed by gateway for routing and policy decisions
Vector Database (RAG)	Vector store that supports per-document metadata and filtered retrieval
Identity Context	User clearance level available in the request context for cross-boundary enforcement

Industry Applicability

Industry	Applicability	Key Driver
Government / Defence	Critical	Statutory classification regimes; information compartmentalisation
Financial Services	Critical	APRA CPS234; classification of customer data, trading data, regulatory submissions
Healthcare	Critical	PHI classification; HIPAA/Privacy Act enforcement
Legal	High	Attorney-client privilege; matter-level classification
Technology	High	IP classification; customer data classification
Retail	Medium	PII classification; payment card data

4. Architecture Overview

The classification architecture operates at four distinct points in the AI pipeline: (1) at data ingestion into AI stores, (2) at retrieval from RAG systems, (3) at prompt assembly, and (4) at output generation. Together these four controls create a classification-aware AI pipeline where no data flows without its sensitivity label.

Classification at Ingestion

When documents or records are ingested into AI data stores (vector databases, RAG corpora, fine-tuning datasets), they carry classification metadata from the source system. The ingestion pipeline:

Reads classification metadata from the document management system or data catalogue.
If no metadata is present, applies the automated classifier to assign a classification level.
Stores classification labels alongside the document embedding in the vector database's metadata fields.
Enforces one-way classification labels: once set, a label may be elevated, but it may be lowered only with human approval and an audit record.
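The ingestion rules above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the four-level ordering (PUBLIC < INTERNAL < CONFIDENTIAL < RESTRICTED) is inferred from the routing rules in this pattern, and the function names and the `steward_approved` flag are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def resolve_ingestion_label(source_label: Optional[Level], auto_label: Level) -> Level:
    """Use the auto-classifier result when source metadata is absent.

    When both labels exist and disagree, keep the higher one, matching the
    "classification label mismatch" row in the Error Flow table.
    """
    if source_label is None:
        return auto_label
    return max(source_label, auto_label)


def update_label(current: Level, proposed: Level, steward_approved: bool = False) -> Level:
    """Allow elevation freely; allow lowering only with human approval."""
    if proposed >= current:
        return proposed
    if not steward_approved:
        raise PermissionError("lowering a classification needs data steward approval")
    return proposed
```

A real pipeline would also write an audit record on every label change; that step is omitted here for brevity.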

Classification Enforcement in RAG Retrieval

When a query is made against the vector database, the retrieval layer applies classification-based filtering:

The query context carries the requesting user's clearance level.
Vector search is filtered to return only documents at or below the user's clearance level (using the vector database's metadata filter capability).
Documents above the clearance level are excluded from retrieval results regardless of semantic relevance.
If no documents within the clearance level are relevant, the response acknowledges the limitation rather than elevating the user's access.
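The filtering predicate can be illustrated with an in-memory stand-in for the vector store. In a real deployment the same predicate would be pushed into the vector database's metadata filter rather than applied after the search. The `Chunk` type, the `filter_by_clearance` name, and the level ordering are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional, Sequence


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass
class Chunk:
    text: str
    classification_level: Level
    score: float  # similarity score returned by the vector search


def filter_by_clearance(
    candidates: Sequence[Chunk], clearance: Optional[Level], top_k: int = 3
) -> List[Chunk]:
    """Return the most relevant chunks at or below the caller's clearance."""
    if clearance is None:
        # Fail-safe: a missing clearance yields no results, never full access.
        return []
    allowed = [c for c in candidates if c.classification_level <= clearance]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]
```

Note that the RESTRICTED chunk is dropped even when it has the highest similarity score: relevance never overrides clearance.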

Prompt Assembly Classification

When the AI Gateway assembles the final prompt (combining system message, user input, and RAG context), it:

Computes the composite classification of the assembled prompt: maximum of all component classifications.
Attaches the classification label to the prompt as gateway metadata.
Passes the label to the routing policy engine, which selects the appropriate model: PUBLIC/INTERNAL → external cloud model allowed; CONFIDENTIAL/RESTRICTED → on-premises or private cloud model only.
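A minimal sketch of the composite rule and the routing decision follows. The route names `external-cloud` and `on-premises` are placeholders, and the level ordering is an assumption; a production gateway would typically express the routing rule in its policy engine rather than in application code.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def composite_level(*components: Level) -> Level:
    """Classify an assembled prompt at the maximum of its component levels."""
    if not components:
        raise ValueError("a prompt needs at least one classified component")
    return max(components)


def route(level: Level) -> str:
    """PUBLIC/INTERNAL may go to an external provider; anything higher stays private."""
    return "external-cloud" if level <= Level.INTERNAL else "on-premises"


# A PUBLIC system message, CONFIDENTIAL user query, and RESTRICTED document chunk.
prompt_level = composite_level(Level.PUBLIC, Level.CONFIDENTIAL, Level.RESTRICTED)
print(prompt_level.name, route(prompt_level))
```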

Output Classification Labelling

Model outputs are labelled with the maximum classification of the inputs used to generate them. This label is:

Attached as a response header by the AI Gateway.
Written to the audit log alongside the response.
Used by output storage systems to apply appropriate retention and access controls.
Presented to downstream automated systems so they handle the content with appropriate protections.
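As an illustration of output labelling, the sketch below attaches the inherited label to a response and builds the matching audit fields. The header name `X-AI-Classification` and the shape of the returned dictionary are hypothetical; the pattern only specifies that the label travels as a response header and is written to the audit log.

```python
from enum import IntEnum
from typing import Dict, Iterable


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical header name chosen for this sketch.
CLASSIFICATION_HEADER = "X-AI-Classification"


def label_response(body: str, input_levels: Iterable[Level]) -> Dict[str, object]:
    """Label a model response with the maximum classification of its inputs."""
    levels = list(input_levels)
    if not levels:
        raise ValueError("cannot label an output without input classifications")
    output_level = max(levels)
    return {
        "body": body,
        "headers": {CLASSIFICATION_HEADER: output_level.name},
        "audit": {
            "output_classification": output_level.name,
            "input_classifications": [lvl.name for lvl in levels],
        },
    }
```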

5. Architecture Diagram

flowchart TD
    subgraph Ingestion["Data Ingestion"]
        A[Source Documents]
        B[Classification Labeller]
        C[Vector Store + Labels]
    end
    subgraph Gateway["AI Gateway"]
        D[User Query + Clearance]
        E[Classification Filter]
        F{Prompt Class Level}
    end
    subgraph Output["Output + Audit"]
        G[Model Response]
        H[Output Label]
        I[Audit Log]
    end
    A --> B --> C
    D --> E -->|clearance-filtered| C
    C --> F
    F -->|public/internal| G
    F -->|restricted| J[Blocked]
    G --> H --> I
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#fef9c3,stroke:#eab308
    style D fill:#dbeafe,stroke:#3b82f6
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#f3e8ff,stroke:#a855f7
    style G fill:#d1fae5,stroke:#10b981
    style H fill:#d1fae5,stroke:#10b981
    style I fill:#fef9c3,stroke:#eab308
    style J fill:#fee2e2,stroke:#ef4444

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Classification Labeller	NLP + Rules	Assigns classification labels to documents lacking metadata; validates existing labels	AWS Comprehend Custom Classifier, Azure AI Language, custom rules + NER	High
Vector DB with Metadata Filtering	Storage	Stores embeddings with classification labels; enforces metadata filters on queries	Pinecone (metadata filter), Weaviate (where filter), pgvector (WHERE clause), Qdrant	Critical
Classification Filter (RAG)	Access Control	Enforces clearance-level filtering on all vector retrievals	Custom query interceptor; DB-level WHERE clause on classification_level	Critical
Composite Classifier	Logic	Computes maximum classification of assembled prompt from all components	Custom logic (max of component labels)	Critical
Routing Policy	Gateway Policy	Routes requests to model providers based on prompt classification level	OPA policy in AI Gateway; routing rules	Critical
Output Classifier	Labelling	Labels model output with classification inherited from input	Custom logic + AI Gateway response enrichment	High
Classification Metadata Store	Catalogue	Authoritative mapping of data source → classification level	Data catalogue (Collibra, Alation), custom database	High
Audit Trail	Compliance	Records classification decisions, routing decisions, and output labels	Same pipeline as AI Gateway audit log	Critical

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	Ingestion Pipeline	Reads source document with classification metadata; if absent, applies auto-classifier	Document + classification label
2	Vector DB Ingestion	Stores embedding with classification label in metadata field	Searchable embedding with `classification_level` metadata
3	User Query	Query arrives at RAG layer with user clearance level in context	Query + clearance level
4	Classification Filter	Constructs metadata filter: `classification_level <= user_clearance_level`	Filtered vector search query
5	Vector DB	Executes filtered ANN search	Results: only documents at/below user's clearance
6	Composite Classifier	Computes max(system_msg_class, user_input_class, rag_context_class)	Composite classification label for assembled prompt
7	Routing Policy	Evaluates classification level against routing policy	Route decision: cloud provider or on-premises model
8	AI Gateway	Routes to appropriate model; attaches classification context to request	Classified request at model endpoint
9	Output Labeller	Labels response with composite input classification	Labelled model response
10	Audit Logger	Records classification decisions, routing, input classification, output classification	Immutable classification audit record
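Step 4 of the flow can be made concrete for the pgvector option listed in the Components table. The sketch below only builds a parameterised query and does not connect to a database. The table and column names (`rag_chunks`, `doc_id`, `chunk_text`, `embedding`) are hypothetical, `classification_level` is assumed to be stored as an integer rank, and the `%s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance.

```python
from typing import Sequence, Tuple


def build_filtered_search(
    query_embedding: Sequence[float], user_clearance_level: int, top_k: int = 5
) -> Tuple[str, Tuple[object, ...]]:
    """Build a pgvector query that enforces the clearance filter in the WHERE clause."""
    sql = (
        "SELECT doc_id, chunk_text, classification_level "
        "FROM rag_chunks "
        "WHERE classification_level <= %s "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    # pgvector accepts a text literal such as "[0.1,0.2]" cast to the vector type.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    return sql, (user_clearance_level, vector_literal, top_k)


sql, params = build_filtered_search([0.1, 0.2], user_clearance_level=1, top_k=3)
print(sql)
print(params)
```

Passing the clearance as a bound parameter keeps the filter out of string concatenation. When an approximate index is in use, it is worth checking how the database combines the filter with the index scan, since filtered queries can return fewer than `top_k` rows.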

Error Flow

Error	Handling	Status	Alert
Document still unlabelled after auto-classification at ingestion	Fail ingestion; alert data steward	400	Warning: unlabelled document blocked from AI store
Query clearance level not present in context	Fail-safe: return no results	403	Security: missing clearance context
Classification label mismatch (auto vs stored)	Log discrepancy; use higher classification	Warning in audit	Warning: classification discrepancy
CONFIDENTIAL/RESTRICTED data attempted routing to external provider	Block request; log violation	403	Critical: classification boundary violation
Output classification inheritance fails	Label as highest input classification	—	Warning: output classification fallback

8. Security Considerations

Authentication & Authorisation

User clearance level must be attested by the Identity Provider (not self-declared by the application). Clearance is a claim in the JWT, signed by the IdP.
Classification label updates require authenticated, authorised data steward identity — not application-level writes.
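The clearance lookup can be sketched as follows, assuming the JWT signature has already been verified upstream by the gateway's usual token validation. The claim name `clearance` and the level names are assumptions; the point is that a missing or unrecognised value fails closed instead of defaulting to some level.

```python
from enum import IntEnum
from typing import Mapping


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def clearance_from_claims(claims: Mapping[str, object]) -> Level:
    """Read the IdP-attested clearance from already-verified JWT claims.

    "clearance" is a hypothetical claim name used for this sketch.
    """
    raw = claims.get("clearance")
    if not isinstance(raw, str):
        raise PermissionError("missing clearance claim")
    try:
        return Level[raw.upper()]
    except KeyError:
        raise PermissionError(f"unrecognised clearance claim: {raw!r}") from None
```

The application never supplies the clearance itself; it only forwards the token, so a compromised application cannot raise its own access level.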

Data Classification

The pattern itself defines and enforces data classification — it is both the policy and the enforcement mechanism.

OWASP LLM Top 10 Coverage

OWASP LLM Risk	Data Classification Mitigation	Coverage
LLM01: Prompt Injection	Classification-aware RAG filtering limits which documents, including poisoned ones, can reach a given user's prompt	Medium
LLM02: Insecure Output Handling	Output labels instruct downstream systems on appropriate handling	High
LLM03: Training Data Poisoning	Classification controls on training data ingestion	Medium
LLM04: Model Denial of Service	Not applicable	None
LLM05: Supply Chain Vulnerabilities	Classification ensures only correctly labelled data enters AI pipelines	Medium
LLM06: Sensitive Information Disclosure	Cross-boundary controls prevent CONFIDENTIAL/RESTRICTED data from reaching external providers	Critical
LLM07: Insecure Plugin Design	Tool outputs classified before returning to agent context	High
LLM08: Excessive Agency	Agent clearance level limits which data it can retrieve and act on	High
LLM09: Overreliance	Not applicable	None
LLM10: Model Theft	Classification-aware retrieval reduces data extraction through the model by limiting the retrievable corpus	Medium

9. Governance Considerations

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Classification Schema	Data Governance	Reviewed annually	Defines classification levels and handling requirements for AI contexts
RAG Corpus Classification Audit	Data Stewards	Quarterly	Verifies all documents in RAG corpus are correctly labelled
Routing Decision Log	AI Platform	Monthly review	Evidence of classification-based routing enforcement
Cross-Boundary Violation Log	Security Operations	Continuous	Records all attempts to route above-threshold data to external providers
Output Label Distribution Report	AI Governance	Monthly	Reports distribution of output classification levels by application

10. Operational Considerations

SLOs

SLO	Target	Measurement
Classification filter enforcement accuracy	100% (zero cross-boundary retrievals)	Monthly test: query for RESTRICTED doc with PUBLIC clearance
Auto-classification accuracy	>95% on validation set	Quarterly evaluation against labelled test set
Classification metadata filter latency overhead	<5ms	Vector DB query latency with/without filter comparison
Routing decision latency overhead	<2ms	OPA routing decision span
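The monthly enforcement test in the first SLO row could be automated along these lines. The `retrieve` callable stands in for the real RAG retrieval entry point and is assumed to return `(doc_id, level)` pairs; both the signature and the probe queries are hypothetical.

```python
from enum import IntEnum
from typing import Callable, Iterable, List, Tuple


class Level(IntEnum):
    # Assumed taxonomy, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Assumed retrieval signature: (query, clearance) -> iterable of (doc_id, level).
Retriever = Callable[[str, Level], Iterable[Tuple[str, Level]]]


def cross_boundary_hits(
    retrieve: Retriever, probe_queries: Iterable[str], clearance: Level
) -> List[Tuple[str, str, Level]]:
    """Return (query, doc_id, level) for every result above the given clearance."""
    violations = []
    for query in probe_queries:
        for doc_id, level in retrieve(query, clearance):
            if level > clearance:
                violations.append((query, doc_id, level))
    return violations
```

The SLO is met only when the returned list is empty for probes that deliberately target known RESTRICTED documents with PUBLIC clearance.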

Incident Management

Cross-boundary violation (RESTRICTED data routed to external provider) → P1 privacy/security incident.
RAG corpus with unlabelled documents → P2: block new queries until audit complete.
Classification metadata corruption → P2: reindex affected documents.

11. Cost Considerations

Cost Drivers

Cost Driver	Description	Relative Impact
Auto-classification compute	NLP model for labelling new documents	Medium
Vector DB metadata storage overhead	Metadata fields add modest storage overhead	Very Low
Data steward time	Human review for auto-classification edge cases	Medium
On-premises model infrastructure	CONFIDENTIAL/RESTRICTED data requires on-premises model — significant cost increase	High

Indicative Cost Range

Scale	Monthly Cost (USD)	Notes
Small corpus (< 100K docs)	$200–$500	Auto-classification model; metadata overhead
Large corpus (1M+ docs)	$1,000–$4,000	Batch classification; ongoing auto-labelling
With on-premises model requirement	+$5,000–$30,000	Additional on-premises inference infrastructure for CONFIDENTIAL/RESTRICTED

12. Trade-Off Analysis

Option Comparison

Option	Description	Pros	Cons	Best For
A: Manual classification only	Human data stewards label all documents	High accuracy; clear accountability	Doesn't scale to large corpora; delays AI onboarding of new data	Small, stable document sets
B: Auto-classification only	ML classifier assigns all labels	Scales to any corpus size; fast	False positives/negatives require human review process	Large corpora with regular updates
C: Hybrid (this pattern)	Auto-classify new documents; human review for high-sensitivity; periodic audit	Scales well; human oversight for critical decisions	Operational process required for review queue	Production enterprise AI
D: Source-system classification only	Trust classification labels from source system (DMS, data catalogue)	Zero additional classification effort; authoritative	Source systems may have incomplete or inconsistent labelling	Mature data governance environments

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
Auto-classifier under-classifies document	Medium	Critical (CONFIDENTIAL doc classified as INTERNAL → external routing allowed)	Quarterly audit of auto-classified documents	Re-classify; investigate false-negative pattern; update classifier
Clearance level missing from request context	Medium	High (fail-safe blocks query; user impact)	403 rate spike	Fix IdP claim mapping; interim manual clearance injection
Vector DB metadata filter bug	Low	Critical (cross-boundary retrieval)	Monthly automated test	Hotfix filter; re-run queries with correct filter
Classification metadata corruption	Low	High (unpredictable routing decisions)	Classification audit anomaly	Reindex from source classification metadata

14. Regulatory Considerations

Regulation	Requirement	Implementation
Australian Privacy Act APP 11	Take reasonable steps to protect personal information	Classification of personal information + routing controls to prevent unauthorised external processing
GDPR Art. 25 (Data Protection by Design and by Default)	Data minimisation and purpose limitation	Classification-based routing ensures personal data is processed only by appropriate, privacy-compliant models
EU AI Act Art. 10 (Training Data)	Data governance requirements for high-risk AI training	Classification controls at data ingestion implement governance requirements
ISO 27001 A.8.2 (Information Classification)	Classify and label information	AI-specific extension of A.8.2 to cover AI data stores and pipeline data
APRA CPS234 §21	Information security commensurate with sensitivity	Classification-based routing directly implements sensitivity-proportionate controls

15. Reference Implementations

AWS

Component	AWS Service
Auto-classification	Amazon Comprehend Custom Classifier
Vector DB with metadata filter	Amazon OpenSearch Service (k-NN with metadata filter)
Classification metadata	AWS Glue Data Catalog
Routing policy	IAM + Amazon Bedrock model access controls
Audit	CloudTrail + Macie (for S3-stored documents)

Azure

Component	Azure Service
Auto-classification	Azure AI Language Custom Classification
Vector DB	Azure AI Search (filter by classification metadata)
Classification metadata	Microsoft Purview
Routing policy	Azure APIM policy with classification header routing
Audit	Azure Monitor + Purview data insights

On-Premises

Component	Technology
Auto-classification	Custom spaCy/Transformers classifier
Vector DB	Weaviate (where filter) or pgvector (WHERE clause)
Classification metadata	Internal data catalogue (Collibra, custom)
Routing policy	OPA in AI Gateway
Audit	Kafka → Elasticsearch

16. Related Patterns

Pattern	ID	Relationship
AI Gateway	EAAPL-SEC001	Classification labels consumed by gateway routing policy
LLM Input Sanitisation	EAAPL-SEC005	SEC005 uses classification to determine redaction intensity
AI Output Filtering	EAAPL-SEC006	Output classification from SEC009 informs SEC006 output policy
Zero-Trust AI Pipeline	EAAPL-SEC007	Classification is an additional dimension of the per-request authorisation policy
Secrets Management for AI	EAAPL-SEC008	Secret paths in vault are classification-tagged

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Pattern definition clarity	4	AI-specific classification challenges are well-understood; some edge cases still evolving
Technology availability	4	Vector DB metadata filtering is production-ready; auto-classification tooling mature
Industry adoption	3	Traditional classification is mature; AI-specific RAG classification enforcement is newer
Regulatory alignment	5	Strong alignment with Privacy Act, GDPR, APRA, ISO 27001
Operational tooling	3	Classification audit tooling for AI corpora requires custom implementation

18. Revision History

Version	Date	Author	Changes
1.0	2024-04-10	Security Architecture Team	Initial pattern definition
1.1	2024-11-01	Security Architecture Team	Added RAG cross-boundary retrieval enforcement detail; output labelling; updated regulatory mapping
