EAAPL-OBS002 · Prompt Monitoring
Pattern ID: EAAPL-OBS002
Status: Proven
Complexity: Medium
Tags: observability prompt-engineering alerting pii-handling medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12
1. Executive Summary
Prompts sent to large language models in production are the primary control surface for AI system behaviour, yet most organisations have no systematic visibility into what prompts are actually being sent, how they change over time, and whether they carry sensitive data. Prompt engineering changes are often deployed without telemetry, creating a class of silent regressions that appear as customer complaints rather than metric alerts.
This pattern defines continuous monitoring of prompts sent to LLMs in production environments. It covers: drift detection to identify unintended prompt distribution changes; anomaly alerting for PII exposure, injection attempts, jailbreak signatures, and abnormal prompt lengths; cost anomaly detection tied to prompt length trends; sensitive data protection through pre-log PII scanning; prompt version tracking with traffic distribution and performance comparison; and prompt performance analytics correlating prompt versions to success rates and user satisfaction. Together, these capabilities give engineering teams the same observability over their prompts that they expect over their code.
Target Audience: CIO, CTO, AI Engineering Lead, Platform Engineering Lead Time to Implement: 4–8 weeks
2. Problem Statement
Business Problem
Prompt engineering is the fastest-changing layer in most AI systems, yet it has no equivalent of git blame, deployment monitoring, or regression alerting. When a prompt change degrades user experience, the signal comes from user complaints or NPS drops — not from a dashboard alert within minutes of deployment. Organisations cannot demonstrate to regulators which prompt version was active when a disputed AI output was generated.
Technical Problem
Prompts are assembled dynamically from templates, retrieved context, and user inputs. The result is that no two prompts are identical, making traditional change detection (file diffing) inapplicable. Statistical monitoring is required to detect when the distribution of prompts has shifted beyond normal variance. Additionally, user inputs injected into prompts can carry PII or adversarial content that bypasses application-layer controls and reaches the model API — where it may be logged by the provider in violation of data agreements.
Symptoms
- Prompt template changes deployed to production with no performance comparison
- Prompt injection attacks detected only through customer complaints or model output review, not automated detection
- Average prompt length increasing 30% over 3 months, driving cost increases, with no alert triggered
- Data breach inquiry reveals customer PII was included in prompts sent to third-party model API
- Different API gateway replicas running different prompt versions simultaneously with no visibility
Cost of Inaction
- Silent prompt regressions persist for days to weeks, degrading user experience for all affected requests
- PII in prompts sent to third-party providers constitutes a data breach under Privacy Act APP 11
- Prompt injection attacks succeed silently, potentially exfiltrating context window data
- Inability to demonstrate version-controlled AI behaviour to regulators constitutes CPS 234 finding
3. Context
When to Apply
- Any production system using dynamic prompt templates with variable context injection
- Systems where multiple prompt versions may be active simultaneously (A/B testing, staged rollouts)
- Any system sending user-provided content as part of prompts to external model APIs
- Organisations subject to APRA, Privacy Act, EU AI Act, or internal AI governance requirements
- Prerequisites: EAAPL-OBS001 (AI Telemetry Architecture) must be in place for log ingestion
When NOT to Apply
- Systems using only static, fixed prompts with no variable content (extremely rare in practice)
- Internal developer tools where all users are trusted and PII exposure risk is accepted
- Proof-of-concept systems with < 30-day planned lifespan
Prerequisites
| Prerequisite | Required | Notes |
|---|---|---|
| EAAPL-OBS001 AI Telemetry Infrastructure | Required | Log ingestion pipeline and structured log schema required |
| Prompt template versioning system | Required | Templates must be version-tagged before monitoring is meaningful |
| Statistical analysis runtime (Python/Spark) | Required | Drift detection requires statistical compute |
| PII detection library | Required | Presidio, AWS Comprehend, or equivalent |
| Secrets management | Required | Keys must not appear in prompt logs |
Industry Applicability
| Industry | Applicability | Primary Driver |
|---|---|---|
| Financial Services | Critical | Regulatory audit, PII in prompts, version control for disputes |
| Healthcare | Critical | PHI in prompts is HIPAA/Privacy Act violation |
| Legal Services | High | Privilege leak in prompts, version accountability |
| Government | High | FOI obligations, prompt injection as attack vector |
| Retail / E-Commerce | Medium | Cost anomaly detection, personalisation prompt quality |
| Technology / SaaS | High | Multi-tenant PII separation, A/B prompt testing |
4. Architecture Overview
The Prompt Monitoring Architecture operates as an analytical overlay on the AI telemetry stream established by EAAPL-OBS001. It does not sit in the critical path of AI request processing; all analysis is performed asynchronously on telemetry data to avoid adding latency to inference calls.
Prompt Sanitisation and Metadata Capture
At the instrumentation layer, the AI Client Wrapper (from EAAPL-OBS001) captures prompt metadata before the prompt is sent to the model. The wrapper computes a SHA-256 hash of the prompt template (without variable content) to identify which template version generated the prompt. It records the template identifier, template version, and the token count of the assembled prompt. Raw prompt content is NOT logged by default. If prompt content logging is approved (for regulated audit purposes), the PII scanner runs synchronously before logging, replacing detected PII with category tokens (e.g., [PERSON_NAME], [CREDIT_CARD]).
Prompt Version Tracking
Every prompt request record includes a promptTemplateId and promptTemplateVersion, enabling the system to track which template versions are active in production at any point. A prompt version registry service maintains the authoritative mapping of templateId+version to the actual template text (stored securely, not in the telemetry stream). The registry exposes APIs used by dashboards to show: current active versions by environment, traffic distribution across versions in A/B tests, and deployment history.
Drift Detection Engine
The drift detection engine runs as a scheduled batch job (every 15 minutes for high-volume systems, hourly for lower volume). For each prompt template, it computes statistical features over the rolling window of prompt instances: mean and standard deviation of input token counts, distribution of context length, vocabulary distribution of injected user content (if content logging approved), and template version mix. These features are compared to a reference baseline established from a rolling 7-day window prior to the analysis period. The Jensen-Shannon divergence between current and baseline distributions is computed for each feature. A divergence score exceeding configurable thresholds triggers a drift alert with the affected template ID, the diverging feature, and the magnitude of divergence.
Anomaly Detection Engine
The anomaly detection engine processes the prompt metadata stream in near-real-time (1-minute micro-batches). It applies four detection rules. First, unusually long prompts: if assembled prompt token count exceeds 3 standard deviations above the rolling mean for that template, the request is flagged. Second, PII detection: a synchronous PII scanner checks assembled prompts (or prompt hashes plus input-field metadata if full content logging is disabled) for PII patterns before they leave the application perimeter. Third, prompt injection signatures: a pattern matcher scans for known injection phrases (ignore previous instructions, you are now, act as, disregard your system prompt, etc.) and for instruction-boundary overrides. Fourth, suspicious structural patterns: prompts with unusual ratios of special characters, base64-encoded content, or role-alternation patterns that do not match the expected template structure.
Cost Anomaly Detection
Prompt token counts are correlated with cost data from the cost telemetry stream. The cost anomaly engine computes a rolling 7-day baseline for average prompt token count per template. If the 1-hour rolling average for any template increases by more than 50% above baseline, a cost anomaly alert is triggered. This catches scenarios where a prompt template change or data pipeline malfunction causes prompts to grow unexpectedly — a common cause of sudden 2–5x cost spikes.
Prompt Performance Analytics
Quality metrics are tracked per prompt template version: success rate (non-error completion), user satisfaction signal (thumbs up/down, task completion if measurable), hallucination rate from EAAPL-OBS003, and latency. When a new template version is deployed, a statistical comparison is automatically initiated between the outgoing and incoming versions using Mann-Whitney U test for latency and proportion z-test for success rate. If the incoming version is statistically significantly worse on any metric at p < 0.05, a deployment gate recommendation is raised.
5. Architecture Diagram
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| PII Scanner | SDK Library | Scan prompt content for PII before logging; redact detected entities | Microsoft Presidio, AWS Comprehend (DetectPiiEntities), Google DLP, spaCy NER | Critical |
| Injection Pattern Matcher | SDK Library | Detect prompt injection signatures in real-time | Rule-based regex + embeddings similarity scorer; custom model fine-tuned on injection examples | Critical |
| Prompt Metadata Logger | SDK Library | Capture templateId, version, token counts, hash; emit to OTel pipeline | Custom wrapper on AI Client Wrapper from EAAPL-OBS001 | Critical |
| Prompt Template Registry | Service | Authoritative version-to-template mapping; deployment history | Git-backed service with API; Backstage plugin; custom service on PostgreSQL | High |
| Drift Detection Engine | Batch Job | Statistical comparison of current vs baseline prompt distributions | Python (scipy, numpy); PySpark for high volume; scheduled on Airflow/Prefect | High |
| Anomaly Detection Engine | Stream Processor | Near-real-time token length and pattern anomaly detection | Flink, Spark Streaming, AWS Kinesis Analytics | High |
| Cost Anomaly Engine | Stream Processor | Correlate prompt token counts with cost; detect cost spikes | Joins prompt metadata with cost telemetry; threshold-based alerting | Medium |
| Performance Comparator | Batch Job | Statistical A/B comparison of prompt versions on quality/latency metrics | Python (scipy stats); automated on every version deployment | High |
| Prompt Analytics Dashboard | UI | Traffic by version, quality trends, anomaly history | Grafana, Datadog, custom React dashboard | Medium |
| Alert Router | Integration | Route alerts to on-call and governance channels | PagerDuty, OpsGenie, Slack, Microsoft Teams | High |
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | AI Client Wrapper | Assembles prompt from template + context + user input | Assembled prompt, templateId, templateVersion |
| 2 | PII Scanner | Scans assembled prompt synchronously for PII entities | Clean prompt (PII replaced with category tokens) or PII alert + redacted prompt |
| 3 | Injection Pattern Matcher | Scans assembled prompt for injection signatures | Clean signal or injection alert with matched pattern |
| 4 | Prompt Metadata Logger | Records templateId, templateVersion, inputTokens, promptHash, timestamp to log record | Structured log record with prompt metadata (no raw content unless approved) |
| 5 | OTel Collector | Receives log record; applies attribute enrichment; forwards to log backend | Enriched log record in storage |
| 6 | Drift Detection Engine | Runs batch analysis on 15-minute window; computes JS divergence vs baseline | Drift score per template; alert if threshold exceeded |
| 7 | Anomaly Detection Engine | Processes micro-batch; evaluates token length and structural anomalies | Anomaly flags on flagged requests; counters incremented |
| 8 | Performance Comparator | On new version deployment: runs statistical comparison vs previous version | A/B test result with p-value and recommendation |
| 9 | Alert Router | Receives alert events; routes to appropriate channel by severity and type | Notifications to PagerDuty, Slack, governance channels |
Error Flow
| Error Scenario | Detection | Action | Recovery |
|---|---|---|---|
| PII scanner unavailable | Health check failure; scanner timeout | Block prompt from being sent to model API; raise P1 alert | Fail closed: no prompt processed without PII scan; restore scanner service |
| Injection pattern DB out of date | Pattern match rate drops to zero for known test patterns | Alert to security team | Update pattern library; hotfix deployment |
| Drift detection job fails | Job completion metric absent; Airflow failure alert | Alert to platform engineering; previous baseline retained | Investigate job logs; re-run manually |
| Template version registry unavailable | API timeout from prompt metadata logger | Log requests with templateId=UNKNOWN; continue processing | Registry restoration; backfill missing version attribution |
| Cost anomaly false positive spike | Alert volume exceeds 20/hour | Suppress and escalate to AI engineering for threshold review | Adjust thresholds; add per-template baseline recalibration |
8. Security Considerations
Authentication: PII scanner and injection matcher services authenticate to the AI Client Wrapper via service-to-service mTLS. Prompt template registry access requires API key + role claim.
Authorisation: Access to prompt content logs (if enabled) requires data governance approval and is restricted to a named set of individuals. Bulk export requires CISO approval. Prompt analytics dashboards showing only aggregated metadata are accessible to AI engineering and product teams.
Secrets Management: Any model API keys or scanner API keys are stored in secrets manager, rotated quarterly. Scanner services running in-process with the AI Client Wrapper inherit the application's secret access; no additional secret scopes required.
Data Classification: Raw prompt content is classified as Confidential if it contains user-provided data. Prompt template text is classified as Internal. PII detected in prompts is classified as Sensitive — alert records are retained but the PII value is never stored, only the entity category and position.
Encryption: Prompt analytics data encrypted at rest (AES-256) and in transit (TLS 1.3). PII alert records stored in a high-security log store with additional access controls beyond the standard telemetry store.
Auditability: Every access to prompt content logs is itself audited. PII detection events are immutable and retained for the full regulatory retention period. Injection attempt logs are retained as security event records.
OWASP LLM Top 10 Coverage
| OWASP LLM Risk | Prompt Monitoring Control | Implementation |
|---|---|---|
| LLM01 Prompt Injection | Injection pattern matcher; structural anomaly detection | Alert on injection signatures within 60 seconds of detection |
| LLM02 Insecure Output Handling | Output monitoring feeds back to prompt analysis | Correlate injection detection with unusual output patterns |
| LLM03 Training Data Poisoning | Input distribution drift monitoring | Detect when prompt inputs shift toward adversarial patterns |
| LLM04 Model Denial of Service | Abnormally long prompt detection | Alert on prompts exceeding 3 sigma token count; rate limit enforcement |
| LLM05 Supply Chain Vulnerabilities | Prompt template version tracking | Detect unexpected template changes not matching deployment records |
| LLM06 Sensitive Information Disclosure | PII scanner before prompt leaves application boundary | Block or redact PII in prompts before reaching third-party model API |
| LLM07 Insecure Plugin Design | Tool call context in prompt metadata | Monitor tool-call instructions injected via prompts |
| LLM08 Excessive Agency | Detect prompts attempting to expand model scope | Alert on role-override patterns; monitor for capability escalation instructions |
| LLM09 Overreliance | Prompt quality analytics; version regression detection | Surface quality regressions before they cause downstream overreliance |
| LLM10 Model Theft | Monitor for prompt patterns designed to extract system prompts | Alert on meta-prompt patterns (tell me your instructions, repeat after me) |
9. Governance Considerations
Responsible AI: Prompt monitoring provides the evidence base for responsible AI review processes. Governance teams can audit which prompt versions were active during a specific period, what PII exposure events occurred, and whether injection attempts were detected and blocked.
Model Risk Management: Material prompt changes constitute model risk events. The prompt version registry and performance comparator provide the documentation and evidence required for model risk sign-off on prompt deployments.
Human Approval: Deployment of new prompt template versions to production requires approval from AI engineering lead for changes affecting > 10% of traffic. Changes to system prompts require AI governance committee approval.
Policy: Prompt content logging policy must be documented, approved by legal and privacy, and reviewed annually. The default is no prompt content logging; any deviation requires explicit approval with defined retention limits and access controls.
Traceability: Every PII detection event is traceable from the alert record to the prompt request (via requestId), to the user session (via hashed userId), to the data source that introduced the PII into the prompt context. This chain supports Privacy Act investigation obligations.
Governance Artefacts
| Artefact | Owner | Frequency | Format |
|---|---|---|---|
| Prompt Version Registry | AI Engineering | Continuous (per deployment) | Version-controlled database with API |
| PII Exposure Incident Log | Privacy / Data Governance | Per incident | Immutable event store record |
| Injection Attempt Report | Security | Weekly | Automated report: count, patterns, severity |
| Prompt A/B Performance Report | AI Engineering | Per version deployment | Automated statistical comparison document |
| Drift Alert History | AI Platform | Monthly review | Dashboard export + trend analysis |
| Prompt Content Logging Authorisation | Legal / Privacy | Annual | Signed policy document |
10. Operational Considerations
Monitoring: The PII scanner and injection matcher are in the critical inference path (synchronous). Their latency and availability must be monitored as first-class SLOs. If the PII scanner fails, the system must fail closed (not continue without scanning).
Logging: Monitoring system operational logs are stored separately from the AI audit logs they monitor, to prevent circular dependencies and to allow independent access control.
Incident Response: PII-in-prompt incidents are treated as data breach candidates and immediately escalate to the privacy officer. Injection attack incidents escalate to the security operations centre. Drift alerts escalate to AI engineering.
Disaster Recovery: PII scanner can run in degraded mode (regex-only, without NER model) during model service outage. This reduces detection accuracy but maintains baseline protection. Injection pattern matcher can fail open for availability (with alert) only if system is behind a WAF with injection rules.
Capacity Planning: PII scanner adds synchronous latency. Benchmarking required to establish acceptable throughput. At 1,000 requests/second, PII scanner must complete in < 10ms to avoid adding perceptible latency. Presidio with spaCy small model achieves 2–5ms for typical prompt lengths.
SLO Table
| SLO | Target | Measurement | Alert Threshold |
|---|---|---|---|
| PII scanner latency | < 10ms p99 | Instrumented scanner response time | > 20ms for 5 minutes |
| PII scanner availability | > 99.9% | Health check pass rate | < 99.5% for 5 minutes |
| Injection detection latency | < 5ms p99 | Pattern matcher response time | > 15ms for 5 minutes |
| Drift detection freshness | Runs within 20 minutes of schedule | Job completion timestamp | > 30 minutes behind schedule |
| Alert delivery from detection to notification | < 5 minutes | Alert timestamp vs. detection timestamp | > 10 minutes |
Disaster Recovery Table
| Component | RTO | RPO | Recovery Approach |
|---|---|---|---|
| PII Scanner | 2 minutes (fail closed) | N/A (stateless) | Auto-restart; fallback to regex-only mode |
| Injection Matcher | 5 minutes (fail open with alert) | N/A (stateless) | Auto-restart; WAF rules as backup |
| Drift Detection Engine | 60 minutes | Last batch | Restart job; run catch-up analysis |
| Prompt Registry | 15 minutes | 1 hour | Database restore; requests continue with UNKNOWN version tag |
| Alert Router | 5 minutes | Near-zero | Active-active; SMS fallback if primary down |
11. Cost Considerations
Cost Drivers
| Driver | Description | Relative Cost |
|---|---|---|
| PII scanner compute (synchronous) | NER model inference per request; scales linearly with request volume | High at large scale |
| Injection pattern matching | Regex fast; embedding similarity slower; regex is recommended for production | Low (regex) to High (embeddings) |
| Drift detection compute | Batch statistical computation; cost scales with data volume and feature count | Medium |
| Prompt analytics storage | Aggregated metadata (no content); much smaller than full log storage | Low |
| A/B comparison compute | Statistical tests on deployment events; infrequent | Low |
Scaling Risks: At very high request volumes (> 10K requests/second), synchronous PII scanning becomes a bottleneck. Mitigation: use streaming architecture where PII scanning happens asynchronously with a short (50ms) buffer before forwarding to model API; fail-closed if buffer not cleared.
Optimisations:
- Use regex-first PII scanning (fast) with NER model as fallback for regex-unconfident cases
- Cache injection pattern compilation (patterns are static; no runtime recompilation)
- Aggregate drift detection metrics at collector before storage; store distribution summaries not raw token counts
Indicative Cost Range
| Scale | Requests/Day | Estimated Prompt Monitoring Cost/Month |
|---|---|---|
| Small | 10,000 | $100–$300 |
| Medium | 500,000 | $800–$2,000 |
| Large | 5,000,000 | $3,000–$8,000 |
| Enterprise | 50,000,000+ | $15,000–$40,000 (with batched PII scanning) |
12. Trade-Off Analysis
Approach Comparison
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Synchronous PII scan + injection match in critical path | Fail-closed; guaranteed pre-delivery protection; no data escapes without scan | Adds latency (2–10ms); availability dependency | Regulated industries; customer-facing AI; any external model API |
| Asynchronous post-delivery analysis only | Zero latency impact; simpler architecture | PII already sent to model provider before detection; too late for injection blocking | Internal tools only; no external model API; low-risk use cases |
| Provider-side content filtering (e.g., Azure Content Safety, Bedrock Guardrails) | Managed service; no infrastructure overhead | Vendor lock-in; limited customisation; PII still traverses provider network; limited telemetry | Organisations without platform engineering capacity; greenfield deployments |
Architectural Tensions
| Tension | Description | Resolution |
|---|---|---|
| Safety vs. Latency | Synchronous scanning adds latency to every request | Use fast regex-first scanning; NER only for regex-uncertain cases; <10ms budget enforced by SLO |
| Privacy vs. Debuggability | Full prompt logging enables root-cause debugging but risks PII storage | Log prompt metadata only by default; content logging requires governance approval + PII scrubbing |
| Sensitivity vs. False Positives | Aggressive injection detection triggers false positives on legitimate complex prompts | Tiered detection: regex-flagged prompts reviewed by NLP classifier before alerting |
| Completeness vs. Cost | Monitoring every prompt provides full coverage but scales cost | Sample monitoring at 100% for anomaly detection; full analysis on flagged subset |
13. Failure Modes
| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| PII scanner false negative (misses PII) | Medium | Critical (data breach) | Regular audit with labeled PII test set | Improve scanner; notify privacy officer of exposure risk |
| Injection attack evades pattern matcher | Medium | High (prompt manipulation) | Output monitoring; user reports | Update pattern library; add embedding-based detection |
| Drift detection baseline staleness | Medium | Medium (false drift alerts) | Alert volume spike; all templates flagged simultaneously | Recalibrate baseline; add seasonal adjustment |
| Template registry unavailable at deployment | Low | Medium (version attribution lost) | Deployment pipeline health check | Queue version registration; backfill when registry recovers |
| PII scanner causes 50ms+ latency spikes | Medium | High (user experience degradation) | p99 latency alert; scanner latency SLO breach | Switch to regex-only mode; alert platform engineering |
Cascading Scenarios
- Scenario 1: PII scanner disabled for maintenance → PII reaches external model API → Provider logs PII → Privacy Act breach notification required. Mitigation: no maintenance window without fail-closed alternative; scanner redundancy mandatory.
- Scenario 2: Injection attack evades detection → System prompt exfiltrated → Attacker crafts targeted follow-up attacks → Escalating security incident. Mitigation: monitor output for system prompt content; WAF rules as secondary control.
14. Regulatory Considerations
| Regulation | Clause | Requirement | Prompt Monitoring Implementation |
|---|---|---|---|
| Privacy Act 1988 (AU) | APP 11.1 (Security) | Personal information must not be disclosed to third parties without consent | PII scanner prevents PII reaching external model APIs; detection events logged |
| Privacy Act 1988 (AU) | APP 11.2 (Destruction) | PII no longer needed must be destroyed | Prompt metadata retained without PII content; destruction schedule enforced |
| APRA CPS 234 | Para 36 (Cyber Incident Response) | Security incidents (injection attacks) detected and reported within defined timeframes | Injection alerts within 60s; escalation to SOC per incident management runbook |
| EU AI Act | Article 12 (Record-keeping) | High-risk AI: inputs that led to a decision must be logged | promptTemplateId + templateVersion + requestId provides traceable record |
| EU AI Act | Article 9.5 (Risk Management) | Identify and analyse known risks of AI systems | Prompt injection classified as known risk; detection and response procedure documented |
| ISO/IEC 42001 | Clause 6.1.2 (AI Risk Assessment) | Risks from AI inputs must be assessed and treated | Prompt injection and PII risk documented; controls (scanner, matcher) implemented |
| NIST AI RMF | GOVERN 4.2, MAP 1.5 | Document and monitor AI-specific risks including adversarial inputs | Prompt monitoring directly addresses adversarial input risk mapping requirement |
15. Reference Implementations
AWS
- PII Scanner: Amazon Comprehend DetectPiiEntities API (synchronous, < 5ms for short prompts)
- Injection Matcher: Custom Lambda with regex + Amazon Bedrock Guardrails prompt attack detection
- Drift Detection: AWS Glue job with PySpark; scheduled via EventBridge
- Prompt Registry: DynamoDB table with version history; API Gateway + Lambda
- Analytics: CloudWatch Logs Insights; Amazon QuickSight dashboards
- Alerts: CloudWatch Alarms → SNS → PagerDuty
Azure
- PII Scanner: Azure AI Language PII Detection (Synchronous REST call)
- Injection Matcher: Azure Content Safety Prompt Shield (detects direct and indirect injection)
- Drift Detection: Azure Databricks job; scheduled via Azure Data Factory
- Prompt Registry: Azure Cosmos DB with change feed; Azure API Management
- Analytics: Azure Monitor Logs; Power BI dashboards
- Alerts: Azure Monitor Alerts → Action Groups → Teams / PagerDuty
GCP
- PII Scanner: Google Cloud DLP (Data Loss Prevention API) with synchronous content inspection
- Injection Matcher: Custom Cloud Function with Vertex AI Safety filters
- Drift Detection: BigQuery scheduled queries; Dataflow streaming job
- Prompt Registry: Firestore with version history; Cloud Endpoints
- Analytics: Looker dashboards; BigQuery for ad-hoc analysis
- Alerts: Cloud Monitoring Alerting → PagerDuty / Cloud Pub/Sub
On-Premises
- PII Scanner: Microsoft Presidio (open source, Python); deploy as sidecar service
- Injection Matcher: Custom rule engine with OWASP injection signature library
- Drift Detection: Apache Spark on Hadoop/Kubernetes; Airflow scheduling
- Prompt Registry: PostgreSQL with versioning schema; FastAPI service
- Analytics: Grafana dashboards against ClickHouse analytics store
- Alerts: Alertmanager → PagerDuty / Opsgenie / Email
16. Related Patterns
| Pattern ID | Pattern Name | Relationship | Notes |
|---|---|---|---|
| EAAPL-OBS001 | AI Telemetry Architecture | Foundation | Provides log ingestion pipeline; structured log schema required |
| EAAPL-OBS003 | Hallucination Detection | Sibling | Both are quality monitoring layers; hallucination detection uses output; this uses input |
| EAAPL-OBS004 | AI Incident Management | Depends On | Injection attacks and PII events feed into incident management lifecycle |
| EAAPL-OBS006 | AI Cost Observability | Sibling | Cost anomaly detection here (prompt token spikes); broader cost attribution in OBS006 |
| EAAPL-OBS007 | Distributed AI Tracing | Extends | Trace context from OBS007 enables linking prompt anomalies to full request traces |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Rationale |
|---|---|---|
| Adoption Breadth | 3 | Adopted by security-conscious and regulated organisations; emerging in general market |
| Tooling Ecosystem | 4 | Presidio, AWS Comprehend, Azure Content Safety are mature; injection detection tooling improving rapidly |
| Operational Runbook Coverage | 3 | PII incident runbooks well-defined; injection attack runbooks organisation-specific |
| Regulatory Evidence | 4 | Privacy Act and APRA audit findings confirm necessity; EU AI Act requirements emerging |
| Cost Predictability | 4 | Cost scales predictably with request volume; PII scanner cost is well-characterised |
| Team Skill Availability | 3 | NLP/NER skills required for custom scanner tuning; regex-only implementations accessible to all teams |
18. Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2026-06-12 | EAAPL Working Group | Initial publication |