EAAPL-PLT010 — AI Developer Portal Architecture

Status: Proven
Tags: rbac audit-logging cost-optimisation llm medium-complexity
Version: 1.1
Last Updated: 2026-06-12
Author: Enterprise AI Architecture Pattern Library


1. Executive Summary

Enterprise AI capabilities—LLM APIs, embedding services, vector stores, AI agent frameworks, fine-tuning pipelines—are proliferating across organisations faster than platform teams can govern them. Without a centralised internal developer portal, AI capability access becomes a shadow IT problem: teams independently onboard to cloud AI APIs, use personal credit cards or shared API keys, apply inconsistent prompt safety guardrails, and generate costs and compliance exposures that are invisible to the platform and security teams.

The AI Developer Portal is an internal platform that provides product engineering teams with self-service access to the organisation's approved AI capabilities under governed, observable, and cost-attributed conditions. It provides: a searchable API catalogue of all approved AI services; a self-service model access request and approval workflow; per-team usage dashboards showing token consumption, costs, and error rates; a sandbox/playground environment for safe exploration without production impact; AI policy guardrail visibility; documentation standards for AI APIs; a developer onboarding flow; and golden path templates for common AI use cases. This pattern follows the platform engineering principle of reducing cognitive load for product teams while embedding non-negotiable governance controls in the platform itself.


2. Problem Statement

Business Problem

Without an AI developer portal, the typical enterprise AI landscape consists of: multiple teams independently subscribed to the same LLM provider; no consolidated view of AI spend or usage patterns; inconsistent security and compliance practices across teams; duplicated AI infrastructure (each team builds their own prompt management, caching, and monitoring); no mechanism to enforce AI governance policies; and no self-service path for new teams to adopt AI, leading to delays as each team navigates vendor onboarding independently.

Technical Problem

Developers need to discover what AI capabilities are available, understand their constraints (rate limits, token limits, pricing, data handling obligations), experiment safely before committing to production, and access APIs through a consistent, observable path. None of these needs are met by direct LLM vendor API access with individual API keys. Direct API access also bypasses organisational controls: prompt injection guardrails, PII redaction, audit logging, cost attribution, and regulatory compliance middleware are all skipped.

Symptoms

  • Three different teams are paying for separate OpenAI API subscriptions with no consolidated volume discount
  • A developer used a personal credit card to access a new LLM because the procurement process takes 6 weeks
  • A production AI feature was deployed with an API key committed to a public GitHub repository
  • The security team cannot identify which AI APIs are in use across the organisation
  • A regulatory audit finds that some AI API calls contain PII that should have been redacted

Cost of Inaction

| Dimension | Consequence |
|---|---|
| Financial | Shadow AI spend uncounted; missed volume discounts; wasted duplicate infrastructure |
| Security | Ungoverned API keys; PII in AI calls; prompt injection vulnerabilities |
| Compliance | AI usage without governance review; PII outside approved processing boundaries |
| Productivity | Each team spends 2–6 weeks setting up AI infrastructure that a portal would provide in 1 day |

3. Context

When to Apply

  • Enterprises with 3+ product teams seeking to use AI capabilities
  • Organisations with AI governance requirements (regulated industries, government, large enterprise)
  • Platforms needing to enforce consistent AI policies (prompt safety, PII redaction, data residency) across all AI usage
  • Organisations seeking to consolidate AI vendor relationships for cost efficiency

When NOT to Apply

  • Single-team organisations or early-stage startups where overhead of a portal is not justified
  • Research environments where governance constraints would impede academic freedom (lighter-weight controls suffice)
  • Organisations with a single, narrowly scoped AI use case

Prerequisites

| Prerequisite | Description |
|---|---|
| Identity and Access Management | SSO/LDAP/AD integration for developer identity; RBAC groups |
| AI Governance Policy | Documented policy defining which AI services are approved, what data classifications are permitted, and what guardrails are required |
| AI Budget | Consolidated AI API budget allocated to the platform for redistribution to teams |
| Platform Engineering Team | Team capable of building and operating the portal infrastructure |
| API Catalogue Seed | Initial list of approved AI APIs and their constraints |

Industry Applicability

| Industry | Portal Priority | Key Requirements |
|---|---|---|
| Financial Services | High — strict governance required | PII guardrails; data residency; compliance review workflow; cost chargeback |
| Healthcare | High — PHI handling controls | HIPAA-compliant AI paths; strict data classification enforcement |
| Government | High — ISM/ASD alignment | Approved cloud services only; classification-based access control |
| Technology / SaaS | Medium-High — rapid team scaling | Fast self-service; golden path templates; sandbox first |
| Media / Publishing | Medium | Content policy guardrails; cost attribution by product |
| Retail / E-commerce | Medium | Personalisation AI; recommendation AI; cost per campaign attribution |

4. Architecture Overview

The AI Developer Portal is structured as a platform that wraps enterprise AI capabilities with governance, observability, and developer experience layers. It follows the internal developer portal pattern (Backstage, Port, Cortex) extended with AI-specific capabilities.

Portal Layer 1 — AI API Catalogue

The AI API Catalogue is the discovery layer. It lists every AI capability approved for enterprise use, organised by category: foundation models (GPT-4o, Claude Sonnet, Gemini Pro), embedding models, image generation, speech-to-text, code generation, AI agent frameworks, vector store services, and MLOps platform capabilities. Each catalogue entry provides: a human-readable description of what the capability does; the data classification tiers it is approved to process (e.g., Public and Internal, not Confidential); applicable guardrails that are pre-configured; rate limits and token limits; pricing per unit of consumption; example API calls with annotated code snippets; OpenAPI specification; and a "Request Access" button that triggers the self-service access workflow. The catalogue is searchable by capability type, approved data classification, pricing, and guardrail compatibility.

Portal Layer 2 — Self-Service Access Request and Approval Workflow

Teams access AI capabilities through a self-service workflow, not through direct vendor API onboarding. The developer completes a brief access request form: team name, use case description, data classification of inputs, expected monthly volume, and compliance considerations. The workflow routes the request for automated or human review based on the capability's risk tier: low-risk capabilities (public data, well-established models, no PII) are auto-approved and credentials issued immediately; medium-risk capabilities require team lead acknowledgment of usage policy; high-risk capabilities (sensitive data, experimental models, agentic capabilities) require platform security team review, which is conducted within 2 business days. On approval, the team receives: a team-scoped API key (with usage tracked to their team); access to the sandbox environment for that capability; usage documentation; and a link to the relevant golden path template.
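
The risk-tier routing described above can be sketched as a small rule function. This is a minimal illustration, not the pattern's prescribed implementation: the `AccessRequest` fields, the tier rules, and the returned route names are assumptions chosen to mirror the three tiers in the text, and a real approval engine would load its rules from the governance policy.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class AccessRequest:
    # Hypothetical request shape based on the form fields named in the text.
    team: str
    capability: str
    data_classification: str  # "Public", "Internal" or "Confidential"
    contains_pii: bool
    experimental_model: bool
    agentic: bool


def classify_risk(req: AccessRequest) -> RiskTier:
    """Assumed rules: sensitive data, experimental models and agents are high risk."""
    if (req.agentic or req.experimental_model or req.contains_pii
            or req.data_classification == "Confidential"):
        return RiskTier.HIGH
    if req.data_classification != "Public":
        return RiskTier.MEDIUM
    return RiskTier.LOW


def route(req: AccessRequest) -> str:
    """Return the review path for a request, one path per risk tier."""
    tier = classify_risk(req)
    if tier is RiskTier.LOW:
        return "auto-approve"
    if tier is RiskTier.MEDIUM:
        return "team-lead-acknowledgement"
    return "security-review"
```

Keeping classification separate from routing means the tier rules can change (for example when a new data classification is introduced) without touching the workflow steps attached to each tier.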

Portal Layer 3 — Per-Team Usage Dashboards

Every approved team has a dedicated usage dashboard showing: monthly token consumption by model; cost breakdown by model and use case; error rates by endpoint; request latency percentiles; budget utilisation vs monthly allocation; quota utilisation vs team limit; and a 90-day trend. The dashboard is accessible by team members and their managers. The platform team has a cross-team view. Finance has a read-only cost-attribution view. Usage data is updated in near-real-time (latency: <2 minutes from API call to dashboard).

Portal Layer 4 — Sandbox/Playground Environment

The Sandbox is a production-isolated environment where developers can explore AI capabilities without consuming production quota, incurring production-attributed costs, or risking production data exposure. The Sandbox provides: a prompt playground UI for interactive LLM experimentation; a pre-configured set of example prompts per capability; a mock tool execution environment for testing agent workflows; real-time token counter showing consumption (deducted from a separate sandbox budget, not team production quota); and complete isolation from production: no production data is accessible in the sandbox, and sandbox API calls are never logged with the same identifiers as production calls. The Sandbox is the mandated first step for any developer new to an AI capability.

Portal Layer 5 — AI Policy Guardrails Visibility

Every AI capability in the catalogue has a Guardrails Panel that shows the developer which controls are automatically applied to their API calls through the portal proxy layer: PII redaction status (on/off; which categories are redacted); prompt injection detection (on/off; sensitivity level); output content filtering (which content categories are filtered); data residency enforcement (which regions the data may be processed in); audit logging (all calls are logged; what is retained; retention period). Teams cannot disable guardrails; they can only see what is applied. Where a team needs a guardrail configuration not available in the standard catalogue, they can submit a Guardrail Exception Request that goes through the platform security team. This creates a visible and auditable exception path rather than a shadow bypass.
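
As a rough illustration of a guardrail that is applied in the proxy and then reported back to the developer, the sketch below redacts email addresses and records the action taken. It is an assumption-laden stand-in: the regular expression, the `<EMAIL>` placeholder, and the `GuardrailResult` shape are hypothetical, and a production deployment would use a dedicated PII engine (the pattern names Microsoft Presidio and AWS Comprehend as options) rather than a single regex.

```python
import re
from dataclasses import dataclass, field

# Deliberately simple email matcher; real PII detection covers many more categories.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class GuardrailResult:
    text: str                                          # prompt after redaction
    actions: list[str] = field(default_factory=list)   # what the guardrail did


def redact_pii(prompt: str) -> GuardrailResult:
    """Replace email addresses and report how many were redacted."""
    redacted, count = EMAIL.subn("<EMAIL>", prompt)
    actions = [f"pii_redacted:email:{count}"] if count else []
    return GuardrailResult(redacted, actions)


result = redact_pii("Contact jane.doe@example.com about the invoice")
print(result.text)     # Contact <EMAIL> about the invoice
print(result.actions)  # ['pii_redacted:email:1']
```

Returning the list of actions alongside the sanitised text is what lets the same record feed both the audit log and a visibility panel, so developers can see what was applied without being able to switch it off.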

Portal Layer 6 — Documentation, Golden Paths, and Onboarding

AI APIs in the portal are documented to a higher standard than raw vendor documentation. Each capability has an AI-specific OpenAPI extension spec that includes the model name and version, token limits (input and output), pricing per token, a data handling declaration, the guardrails applied, and known limitations. Golden Path templates (pre-built code patterns for common AI use cases: RAG search, document summarisation, customer support chatbot, code review, structured extraction) are provided in Python, TypeScript, and Java. The developer onboarding flow is: (1) explore catalogue and sandbox; (2) submit access request; (3) receive credentials and golden path template; (4) set up usage monitoring; (5) deploy to staging with portal-proxied API calls. This flow is designed to take less than 1 day end-to-end for low-risk capabilities.


5. Architecture Diagram

flowchart TD
    subgraph Developers["Developers and Teams"]
        DEV[Developer]
        TEAMLEAD[Team Lead / Manager]
        FINANCE[Finance / BI]
    end
    subgraph Portal["AI Developer Portal"]
        CATALOGUE[AI API Catalogue\nSearch + Browse + Filter]
        REQFLOW[Access Request Workflow\nAuto / Team Lead / Security Review]
        SANDBOX[Sandbox / Playground\nIsolated from Production]
        DOCS[Documentation Hub\nOpenAPI + AI Extensions + Golden Paths]
        GUARDRAILS_VIS[Guardrails Visibility Panel\nPII / Injection / Content / Residency]
    end
    subgraph ProxyLayer["AI Gateway / Proxy Layer"]
        AUTHZ[Auth + RBAC Check\nTeam-Scoped Key Validation]
        GUARDRAILS_APPLY[Guardrail Middleware\nPII Redact + Injection Detect]
        RATELIMIT[Rate Limit + Quota\nPer-Team Enforcement]
        COSTTRACK[Cost Tracker\nToken + Call Attribution]
        AUDITLOG[Audit Logger\nImmutable Per-Call Record]
    end
    subgraph AI_Services["Approved AI Services"]
        LLM1[OpenAI / Azure OpenAI\nGPT-4o / GPT-4o-mini]
        LLM2[Anthropic\nClaude Sonnet / Haiku]
        LLM3[Google Vertex AI\nGemini Pro]
        EMBED[Embedding Services]
        AGENT[Agent Frameworks]
    end
    subgraph Observability["Observability and Governance"]
        DASHBOARD[Per-Team Usage Dashboard\nTokens / Cost / Errors / Latency]
        COSTATTR[Cost Attribution Engine\nBU / Team / Use Case]
        ALERT[Budget + Quota Alerts\n50% / 80% / 100%]
        APPROVAL_SVC[Approval Engine\nAuto / Human Review]
    end
    DEV --> CATALOGUE
    DEV --> SANDBOX
    DEV --> DOCS
    CATALOGUE --> REQFLOW
    REQFLOW --> APPROVAL_SVC
    APPROVAL_SVC -->|Approved| DEV
    DEV -->|API Call via Portal| AUTHZ
    AUTHZ --> GUARDRAILS_APPLY
    GUARDRAILS_APPLY --> RATELIMIT
    RATELIMIT --> COSTTRACK
    COSTTRACK --> AUDITLOG
    AUDITLOG --> LLM1
    AUDITLOG --> LLM2
    AUDITLOG --> LLM3
    AUDITLOG --> EMBED
    AUDITLOG --> AGENT
    COSTTRACK --> DASHBOARD
    COSTTRACK --> COSTATTR
    COSTATTR --> ALERT
    TEAMLEAD --> DASHBOARD
    FINANCE --> COSTATTR
    DEV --> GUARDRAILS_VIS

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| AI API Catalogue | UI + Database | List approved AI services; search; access request trigger | Backstage (open source); Port; Cortex; custom React + PostgreSQL | High |
| Access Request Workflow Engine | Workflow | Route access requests; auto-approve or human-review; issue credentials | Jira Service Management; ServiceNow; custom Temporal workflow | High |
| Sandbox / Playground | UI + Infrastructure | Isolated interactive AI experimentation environment | Custom React UI + isolated API proxy; PromptLayer; LangSmith | High |
| AI Gateway / Proxy | Infrastructure | Single entry point for all AI API calls; enforces controls | Kong; Apigee; AWS API Gateway; LiteLLM Proxy; custom | Critical |
| Guardrail Middleware | Processing | PII redaction, prompt injection detection, content filtering on all API calls | Microsoft Presidio; NeMo Guardrails; AWS Comprehend; custom | Critical |
| Rate Limit and Quota Engine | Processing | Enforce per-team token and call limits; prevent quota exhaustion | Redis + sliding window algorithm; Kong rate limiting plugin | High |
| Cost Tracker + Attribution Engine | Analytics | Per-call cost metering; attribute to team/user/use-case | Custom Kafka consumer + DynamoDB; ClickHouse; BigQuery | High |
| Audit Logger | Security | Immutable per-call audit record: timestamp, team, model, token counts, guardrail actions | S3 + Object Lock; Azure Immutable Blob; ClickHouse | Critical |
| Per-Team Usage Dashboard | Reporting | Self-service usage and cost visibility for teams | Grafana + API datasource; Retool; custom React + Chart.js | High |
| Documentation Hub | Content | AI-extended OpenAPI specs; golden path templates; onboarding guides | Backstage TechDocs; GitBook; Confluence | High |
| Credential Manager | Security | Issue, rotate, and revoke team-scoped API keys | HashiCorp Vault; AWS Secrets Manager; Azure Key Vault | Critical |
| Budget and Quota Alert System | Operations | Notify team leads and platform on threshold breach | Email + Slack; PagerDuty; SNS + SES | Medium |
| Approval Engine | Governance | Auto-approve or route for human review based on capability risk tier | Custom rule engine; Jira workflow; ServiceNow | High |
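
To make the Cost Tracker and Budget and Quota Alert System responsibilities concrete, the following sketch meters each call against a per-team budget and reports which alert thresholds a call newly crosses. The 50/80/100% thresholds come from the architecture diagram; the model name, per-1,000-token prices, and class shape are hypothetical placeholders, not real vendor pricing.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens: (input, output).
PRICES = {"example-model": (Decimal("1.00"), Decimal("2.00"))}
THRESHOLDS = (50, 80, 100)  # percent of monthly budget, per the diagram


class CostTracker:
    def __init__(self, budgets: dict[str, Decimal]):
        self.budgets = budgets
        self.spend = defaultdict(Decimal)   # running spend per team
        self.alerted = defaultdict(set)     # thresholds already notified

    def record(self, team: str, model: str, input_tokens: int, output_tokens: int) -> list[int]:
        """Attribute one call's cost to a team; return thresholds newly crossed."""
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1000
        self.spend[team] += cost
        utilisation = self.spend[team] / self.budgets[team] * 100
        newly_crossed = [t for t in THRESHOLDS
                         if utilisation >= t and t not in self.alerted[team]]
        self.alerted[team].update(newly_crossed)
        return newly_crossed
```

Using `Decimal` rather than floats avoids rounding drift when many small per-call costs are summed for chargeback, and remembering which thresholds have fired keeps each alert to one notification per budget period.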

7. Data Flow

Primary Flow (Developer Onboarding and First API Call)

| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Developer | Browse AI API Catalogue; identify capability needed | Selected capability from catalogue |
| 2 | Developer | Explore capability in Sandbox/Playground | Confidence in capability suitability |
| 3 | Developer | Submit access request: team, use case, data classification | Access request record in workflow engine |
| 4 | Approval Engine | Evaluate risk tier; auto-approve or route for review | Approval decision within configured SLA |
| 5 | Credential Manager | Issue team-scoped API key with configured limits | API key delivered to developer via secure channel |
| 6 | Developer | Configure application to call AI APIs through portal proxy (not direct vendor) | Application configured with portal endpoint + team API key |
| 7 | AI Gateway | Receive API call; validate team API key; check RBAC | Authenticated and authorised request |
| 8 | Guardrail Middleware | Apply PII redaction; prompt injection scan; content filter | Sanitised request ready for model |
| 9 | Rate Limit Engine | Check team's remaining quota; decrement counter | Permitted or rate-limited response |
| 10 | AI Vendor API | Execute inference; return response | Raw model response |
| 11 | Cost Tracker | Record tokens consumed; attribute to team; update running total | Cost record attributed to team |
| 12 | Audit Logger | Write immutable call record: team, model, tokens, guardrail actions, latency | Audit log entry |
| 13 | Usage Dashboard | Update near-real-time dashboard with this call's data | Dashboard updated within 2 minutes |
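
Steps 7 to 12 of the primary flow can be sketched as an ordered chain of stages, each of which may reject the call before it reaches the vendor. Everything named here is an assumption for illustration: the key table, quota values, `Rejected` exception, and the whitespace-based token estimate are placeholders, and the guardrail stage is a stub.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class Rejected(Exception):
    def __init__(self, status: int, reason: str):
        super().__init__(reason)
        self.status = status


@dataclass
class Call:
    api_key: str
    prompt: str
    team: Optional[str] = None
    steps: list = field(default_factory=list)


TEAM_KEYS = {"key-search-team": "search"}  # hypothetical team-scoped keys
QUOTA = {"search": 2}                      # remaining calls per team
AUDIT_LOG: list = []                       # stand-in for an immutable store


def authenticate(call: Call) -> None:      # step 7
    call.team = TEAM_KEYS.get(call.api_key)
    if call.team is None:
        raise Rejected(401, "unknown API key")
    call.steps.append("authenticated")


def guardrails(call: Call) -> None:        # step 8 (stub: no real filtering)
    call.steps.append("guardrails-applied")


def check_quota(call: Call) -> None:       # step 9
    if QUOTA[call.team] <= 0:
        raise Rejected(429, "team quota exhausted")
    QUOTA[call.team] -= 1
    call.steps.append("quota-decremented")


def handle(call: Call, vendor: Callable[[str], str]) -> str:
    for stage in (authenticate, guardrails, check_quota):
        stage(call)
    response = vendor(call.prompt)         # step 10
    # Steps 11-12: crude word count stands in for real token usage.
    tokens = len(call.prompt.split()) + len(response.split())
    AUDIT_LOG.append({"team": call.team, "tokens": tokens})
    return response
```

Ordering the stages so that authentication runs first means every later stage can rely on `call.team` being set, which is what makes per-team quota and attribution possible.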

Error Flow

| Step | Failure | Detection | Recovery |
|---|---|---|---|
| Guardrail Middleware Outage | Calls not filtered; PII may reach vendor API | Health check on middleware; portal canary test | Block all API calls until middleware restored; alert security team |
| AI Vendor API Unavailable | Teams cannot reach approved AI capability | Health check; circuit breaker in proxy | Serve cached unavailability notice; route to alternate approved model if configured |
| Team Quota Exhausted | Rate limit enforced; developer's calls rejected | Rate limit metric; developer receives 429 | Developer notified; team lead can request quota increase via self-service |
| Audit Log Pipeline Failure | Calls not being logged; compliance gap | Log pipeline health check | Block API calls until logging restored (fail-safe); alert compliance team |
| Credential Compromise | Team API key found in public repository | GitHub secret scanning alert; anomaly in usage patterns | Immediately rotate key; audit calls made with compromised key; notify team |
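
The "Team Quota Exhausted" row relies on a per-team rate limiter. A minimal in-memory sliding-window limiter is sketched below, assuming a single proxy process; a shared store (the Components section names Redis) would be needed once the proxy is scaled horizontally. The class name and the injectable `clock` parameter are illustrative choices.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` calls per team in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock                  # injectable so tests need not sleep
        self._events = defaultdict(deque)   # team -> timestamps of allowed calls

    def allow(self, team: str) -> bool:
        now = self.clock()
        events = self._events[team]
        # Drop timestamps that have slid out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False                    # caller should respond with HTTP 429
        events.append(now)
        return True
```

A call that returns `False` maps to the 429 response in the table; rejected calls are not recorded, so a team that keeps retrying does not extend its own lockout.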

8. Security Considerations

Portal Security Controls

| Domain | Control | Implementation | Notes |
|---|---|---|---|
| Authentication | SSO with MFA required to access portal and request access; API keys are team-scoped, not personal | SAML/OIDC SSO; MFA enforced | Prevents shared or personal credential use |
| Authorisation | RBAC: Developer (read catalogue, submit requests, view own team's dashboard); Team Lead (approve team requests, view team dashboard); Platform Admin (all); Finance (cost reports only) | RBAC in portal application; portal proxy validates team API key against capability permissions | |
| Secrets | Team API keys stored in encrypted credential manager; never shown in UI after initial issuance; rotated every 90 days | HashiCorp Vault; AWS Secrets Manager | Rotation prevents long-lived key exposure |
| Classification | Catalogue entries tagged with maximum approved data classification; proxy enforces classification — calls containing Confidential data to Public-data-only APIs are blocked | Data classification middleware in proxy | |
| Encryption | All portal traffic TLS 1.3; audit logs encrypted at rest with CMEK; API keys encrypted in credential manager | Cloud-native TLS; CMEK | |
| Auditability | Every access request, approval, credential issuance, rotation, and revocation is logged immutably alongside per-call API audit records | Append-only audit tables; S3 Object Lock for API call logs | |
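
The Classification control amounts to an ordered comparison between the data classification of a call and the maximum classification a catalogue entry is approved for. A minimal sketch, assuming the three tiers mentioned in the catalogue description (Public, Internal, Confidential) and a hypothetical function name:

```python
# Ordered from least to most sensitive; tiers taken from the catalogue description.
CLASSIFICATION_ORDER = ["Public", "Internal", "Confidential"]


def is_call_permitted(call_classification: str, capability_max: str) -> bool:
    """True when the call's data is no more sensitive than the capability allows."""
    call_rank = CLASSIFICATION_ORDER.index(call_classification)
    max_rank = CLASSIFICATION_ORDER.index(capability_max)
    return call_rank <= max_rank


print(is_call_permitted("Internal", "Internal"))    # True
print(is_call_permitted("Confidential", "Public"))  # False
```

An unknown classification raises `ValueError` from `list.index`, which errs on the side of rejecting the call rather than silently permitting it.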

OWASP LLM Top 10 — Portal Controls

| OWASP LLM Risk | Portal Control | Implementation |
|---|---|---|
| LLM01 Prompt Injection | Prompt injection detection applied to all API calls through the portal proxy | NeMo Guardrails; custom pattern matcher; block or flag high-confidence injections |
| LLM02 Insecure Output Handling | Output validation middleware strips executable content before returning to caller | Output schema validation; content type enforcement |
| LLM03 Training Data Poisoning | Not applicable to inference portal; addressed in training pipeline pattern | Portal is inference-only; training pipelines are separate |
| LLM04 Model Denial of Service | Per-team rate limiting and quota prevents any single team from exhausting shared capacity | Sliding window rate limiter; per-team token budget |
| LLM05 Supply Chain Vulnerabilities | Only approved AI vendors in catalogue; all vendor integrations security-reviewed | Vendor approval process; DPA in place for all catalogue entries |
| LLM06 Sensitive Information Disclosure | PII redaction applied to all API calls before forwarding to vendor | Microsoft Presidio; AWS Comprehend; configurable per data classification |
| LLM07 Insecure Plugin Design | Agent framework capabilities reviewed before catalogue listing; tool permissions documented | Tool permission documentation required in catalogue entry; excessive-permission tools rejected |
| LLM08 Excessive Agency | Agentic capabilities in catalogue have mandatory cost ceiling documentation; human oversight guardrails documented | Catalogue entry requires cost ceiling + human oversight method for agent capabilities |
| LLM09 Overreliance | Catalogue entries include known limitations and recommended human review guidance | Mandatory "Limitations and Caveats" section in every catalogue entry |
| LLM10 Model Theft | Portal proxy does not expose model weights or architecture; only inference results | Proxy design: forward only inference requests; no model download capability |

9. Governance Considerations

Portal Governance

| Domain | Requirement | Owner | Cadence |
|---|---|---|---|
| Catalogue Currency | All catalogue entries reviewed for accuracy; deprecated capabilities removed | Platform Engineering | Quarterly |
| Access Request SLAs | Low-risk: auto-approve; medium: 24h; high: 48h | Platform Security | Per-request |
| Guardrail Policy | Guardrail configurations reviewed and updated as new attack vectors emerge | Platform Security + AI Governance | Quarterly + on incident |
| Budget Allocation | Team AI budgets reviewed and adjusted | Finance + BU heads | Quarterly |
| Audit Log Review | Audit logs reviewed for anomalous patterns; exported for compliance | Platform Security | Monthly |
| Golden Path Currency | Templates tested against current API versions; updated when breaking changes occur | Platform Engineering | On model version change |

Governance Artefacts

| Artefact | Description | Retention |
|---|---|---|
| AI API Catalogue Version History | Record of when capabilities were added, modified, or retired | Permanent |
| Access Request and Approval Records | All access requests with justification and approval decision | 7 years |
| Team API Key Issuance and Rotation Log | When keys were issued, to whom, rotated, or revoked | 7 years |
| Per-Call Audit Logs | Immutable record of every AI API call through the portal | 7 years |
| Monthly Cost Attribution Reports | Per-team, per-capability cost summaries for chargeback | 7 years |
| Guardrail Exception Requests | Approved exceptions to standard guardrail configuration | 5 years |

10. Operational Considerations

Monitoring and SLOs

| SLO | Target | Measurement | Breach Action |
|---|---|---|---|
| Portal Gateway Availability | 99.9% per month | Synthetic probes every 60 seconds | P1 incident; investigate; notify all teams |
| Access Request Processing | Low-risk: <15 minutes; medium: <24h; high: <48h | Request-to-approval duration | SLA breach alert to platform team; manual escalation |
| Guardrail Middleware Latency | <100ms added to 99th percentile API call | P99 latency of guardrail processing | Investigate; scale horizontally; async mode for non-blocking guardrails |
| Dashboard Data Freshness | <2 minutes from API call to dashboard | Data pipeline lag metric | Alert data engineering; manual cache refresh |
| Audit Log Integrity | 100% of API calls have corresponding audit log entry | Reconciliation: API call count vs log entry count | Immediately investigate; may be compliance-reportable gap |
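
The Audit Log Integrity reconciliation can be sketched as a set difference between the calls the gateway handled and the calls that have an audit entry. The SLO as written compares counts; matching on a per-call request ID, as assumed here, additionally identifies which calls are missing. The function name and ID format are hypothetical.

```python
def find_unlogged_calls(gateway_request_ids: list[str],
                        audit_request_ids: list[str]) -> list[str]:
    """Return request IDs seen by the gateway that have no audit log entry."""
    missing = set(gateway_request_ids) - set(audit_request_ids)
    return sorted(missing)


gateway = ["req-001", "req-002", "req-003"]
audited = ["req-001", "req-003"]
print(find_unlogged_calls(gateway, audited))  # ['req-002']
```

An empty result corresponds to the 100% target; any non-empty result is the gap that the breach action asks the team to investigate.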

Capacity Planning

The portal proxy adds 50–150ms to API call latency for the guardrail layer. This is acceptable for most AI use cases (human-facing chatbots: <300ms budget; batch processing: latency-insensitive). For latency-critical applications (<100ms budget), an accelerated path with lightweight guardrails may be needed. Portal proxy should be horizontally scalable behind a load balancer; auto-scaling should be triggered at 60% CPU utilisation.

Disaster Recovery

| Scenario | Impact | Recovery |
|---|---|---|
| Portal Proxy Outage | All AI API calls fail through portal | Direct vendor access (break-glass credentials) available for P0 production incidents; restore portal within 4 hours |
| Catalogue Database Failure | Cannot browse or request new capabilities | Restore from backup within 1 hour; existing team credentials unaffected |
| Audit Log Store Unavailable | Compliance gap during outage | Block new API calls (fail-safe); restore audit store; reconcile calls during gap from proxy access logs |

11. Cost Considerations

Cost Drivers

| Cost Driver | Indicative Cost | Notes |
|---|---|---|
| Portal Proxy Infrastructure | USD 2,000–20,000/month | Scales with request volume; Kong / Apigee licensing or cloud-native API GW |
| Guardrail Middleware | USD 1,000–10,000/month | PII detection cost scales with text volume |
| Usage Dashboard Infrastructure | USD 500–5,000/month | Grafana + data store; or Retool licensing |
| Catalogue and Portal Application | USD 1,000–5,000/month | Hosting; Backstage or Port licensing |
| Platform Engineering FTE | USD 300,000–600,000/year | 1–2 FTE to build and maintain |
| Audit Log Storage | USD 200–2,000/month | Scales with call volume; Object Lock WORM storage |

AI Spend Governance Value (Cost Savings Through Portal)

| Benefit | Estimated Value |
|---|---|
| Volume discount through consolidated API keys | 10–30% reduction in per-token pricing |
| Elimination of shadow AI spend | 20–40% reduction in untracked spend |
| Model tier routing (routing simple tasks to cheaper models) | 30–60% reduction in per-task model cost |
| Duplicate infrastructure elimination | USD 50,000–200,000/year in avoided team-level AI infrastructure spend |

Indicative Implementation Cost Range

| Organisation | Annual Portal Cost | Notes |
|---|---|---|
| Small (3–10 teams, <1M API calls/month) | USD 200,000–500,000 | Lightweight stack; Backstage + Kong |
| Mid-size (10–50 teams, 1M–50M calls/month) | USD 500,000–1,500,000 | Full stack; dedicated platform team |
| Large enterprise (50+ teams, >50M calls/month) | USD 1,500,000–5,000,000 | Enterprise licensing; large portal team |

12. Trade-Off Analysis

Architecture Options

| Option | Description | Pros | Cons | Recommended For |
|---|---|---|---|---|
| Option A: Build on Backstage + LiteLLM Proxy | Open-source Backstage for portal UI + LiteLLM Proxy for AI gateway | Lowest licensing cost; full customisation; strong community | Requires platform engineering investment; ongoing maintenance | Organisations with strong platform engineering capability |
| Option B: Commercial Internal Developer Portal (Port, Cortex) + Cloud API GW | Commercial IDP platform + cloud-native API gateway | Faster time to value; managed maintenance; enterprise support | Higher licensing cost; less flexibility | Organisations with limited platform engineering capacity |
| Option C: Cloud Provider AI Platform (Azure AI Studio, AWS Bedrock console) | Use cloud provider's native AI portal capabilities | Seamlessly integrated with provider ecosystem; no build cost | Vendor lock-in; limited customisation; single-cloud only | Organisations already deeply committed to a single cloud provider |

Architectural Tensions

| Tension | Trade-Off | Resolution |
|---|---|---|
| Governance vs Developer Velocity | Strong controls (approval workflows, guardrails) slow AI adoption | Auto-approve low-risk capabilities; guardrails are applied automatically in the proxy and require no integration work from the developer |
| Centralised Platform vs Team Autonomy | Centralised portal removes team control over AI infrastructure | Teams retain control over prompts, use cases, and integration; portal controls only what crosses governance boundary |
| Completeness vs Time to Value | Comprehensive portal takes 6–12 months to build; teams need AI now | Phase 1 (8 weeks): catalogue + access request + proxy + audit log; Phase 2: sandbox + dashboard; Phase 3: golden paths + advanced features |
| Proxy Latency vs Control | Every additional middleware layer adds latency | Profile guardrail latency; async processing for non-blocking guardrails; hardware acceleration for high-volume paths |

13. Failure Modes

| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Portal Becomes Shadow IT Bypass Target | High | High — teams route around portal to direct vendor APIs | API call origin monitoring; vendor invoice vs portal call count mismatch | Enforce portal use via network egress rules; no direct vendor access from production networks |
| Guardrail False Positives Block Legitimate Calls | Medium | Medium — developer productivity impact; trust in portal erodes | Developer feedback; error rate spike on guardrail decision | Tune guardrail sensitivity; add team-specific exception with audit |
| Catalogue Staleness | High | Medium — developers use outdated API specs; misconfigurations | Version mismatch alerts; developer-reported errors | Implement automated API spec refresh from vendor APIs; quarterly manual review |
| Audit Log Gap | Low | Critical — compliance exposure; cannot demonstrate what AI calls were made | Log pipeline monitoring; reconciliation check | Fail-safe: block API calls if audit log unavailable |
| Access Request Bottleneck | Medium | Medium — 48h SLA for high-risk capabilities delays AI adoption | Request backlog metric; SLA breach rate | Add reviewers; pre-approve common high-risk patterns; escalate to platform leadership |
| Portal Single Point of Failure | Low | Critical — all AI calls fail if proxy is down | Availability monitoring; synthetic probes | Multi-AZ deployment; auto-scaling; break-glass direct access for P0 production |

Cascading Failure Scenario

The AI Developer Portal is deployed with a proxy that adds PII redaction and audit logging but does not have multi-AZ redundancy. A database maintenance window causes the audit log store to be unavailable for 45 minutes. The proxy is configured to fail-open (allow calls even when audit logging is unavailable) to prevent developer disruption. During the 45-minute window, 50,000 API calls are made without audit logging. A subsequent compliance audit finds the logging gap. Because the audit log is the record that ties each call to a team API key, investigators cannot fully reconstruct what calls were made during the window. A data subject's Subject Access Request cannot be fully satisfied because some AI calls made during this period are unknown. The GDPR Article 30 record of processing is incomplete for this period. Remediation: configure fail-safe (block calls if audit unavailable); add multi-AZ audit log with write-ahead buffer.
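
The difference between the fail-open configuration in this scenario and the fail-safe remediation can be shown in a few lines. This is a sketch under stated assumptions: `AuditUnavailable`, `CallBlocked`, and the `forward` function are hypothetical names, and the audit record is written before the vendor call, following the ordering in the architecture diagram.

```python
from typing import Callable


class AuditUnavailable(Exception):
    """Raised by the audit writer when the log store cannot be reached."""


class CallBlocked(Exception):
    """Raised when the proxy refuses to forward a call it cannot audit."""


def forward(prompt: str,
            write_audit: Callable[[dict], None],
            call_vendor: Callable[[str], str],
            fail_open: bool = False) -> str:
    try:
        write_audit({"event": "ai_call", "prompt_chars": len(prompt)})
    except AuditUnavailable as exc:
        if not fail_open:
            # Fail-safe: never send a call to the vendor without an audit record.
            raise CallBlocked("audit log unavailable; call not forwarded") from exc
        # Fail-open: the call proceeds unlogged, which is the gap in the scenario.
    return call_vendor(prompt)
```

Defaulting `fail_open` to `False` makes the safe behaviour the one a deployment gets unless someone deliberately opts out, which is the posture the remediation recommends.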


14. Regulatory Considerations

| Regulation | Portal Relevance | Portal Control | Reference |
|---|---|---|---|
| GDPR Article 30 — Records of Processing | Every AI API call via portal creates a processing record | Audit logger generates records for Article 30 compliance | GDPR Article 30 |
| GDPR Article 25 — Privacy by Design | Guardrails (PII redaction) embedded in portal proxy | Privacy-by-default: PII redaction on by default for all data classifications | GDPR Article 25 |
| Privacy Act APP 11 — Security | Portal enforces security controls across all AI API usage | Auth, RBAC, audit logging, guardrails | APP 11 |
| APRA CPS234 ¶17 — Controls | Portal implements preventive and detective controls for all AI API use | Auth + guardrails + audit = preventive + detective | CPS234 Paragraph 17 |
| EU AI Act — Transparency and Documentation | Catalogue entries include transparency information and limitations | Mandatory "Limitations and Caveats" section; data handling declaration | EU AI Act Article 13 |
| ISO 42001 Clause 7 — Support | Portal provides the awareness and documentation support required by Clause 7 | Documentation Hub + Golden Paths serve ISO 42001 awareness obligation | ISO/IEC 42001 Clause 7.2–7.3 |
| SOX / Financial Controls | AI spend attribution for financial services companies; audit trail | Cost Attribution Engine + immutable audit logs | SOX Section 302 |
| NIST AI RMF GOVERN 1.3 — Policies | Portal enforces AI usage policies across all teams | Guardrails applied uniformly; no bypasses; policy visible in catalogue | NIST AI RMF GOVERN 1.3 |

15. Reference Implementations

AWS

| Component | AWS Service / Tool |
|---|---|
| AI API Catalogue | AWS Service Catalog + Backstage (EC2/ECS hosted) |
| AI Gateway / Proxy | Amazon API Gateway + AWS Lambda (guardrail middleware) |
| Guardrail Middleware | Amazon Comprehend (PII) + custom Lambda |
| Rate Limiting | API Gateway usage plans + Lambda token counter in DynamoDB |
| Cost Tracker | Kinesis Data Streams + Lambda consumer + DynamoDB cost store |
| Audit Logger | CloudTrail + S3 Object Lock |
| Usage Dashboard | Amazon QuickSight; or Grafana on EC2 |
| Credential Manager | AWS Secrets Manager |
| Sandbox Playground | Custom React app on AWS Amplify + isolated API Gateway |
| Documentation | Backstage TechDocs on S3 + CloudFront |

Azure

| Component | Azure Service / Tool |
| --- | --- |
| AI API Catalogue | Azure Developer Portal (APIM) + custom catalogue extension |
| AI Gateway / Proxy | Azure API Management (built-in gateway) |
| Guardrail Middleware | Azure AI Language (PII) + APIM policy |
| Rate Limiting | APIM built-in rate limiting policies |
| Cost Tracker | Azure Event Hubs + Azure Function + Cosmos DB |
| Audit Logger | Azure Monitor + Immutable Blob Storage |
| Usage Dashboard | Power BI + Azure Monitor |
| Credential Manager | Azure Key Vault + Managed Identities |
| Sandbox Playground | Custom app on Azure Static Web Apps + separate APIM instance |
| Documentation | Azure DevOps Wiki + APIM developer portal |
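
The Cost Tracker row describes a stream consumer that turns usage events into per-team cost records. A minimal sketch of that aggregation step is shown here, independent of any cloud SDK. The event fields (`team`, `model`, `input_tokens`, `output_tokens`), the model names, and the prices are illustrative assumptions, not values from the pattern or from any vendor price list.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical price list: USD per 1,000 tokens, keyed by model name.
PRICE_PER_1K = {
    "model-small": {"input": Decimal("0.0005"), "output": Decimal("0.0015")},
    "model-large": {"input": Decimal("0.01"), "output": Decimal("0.03")},
}


def event_cost(event: dict) -> Decimal:
    """Cost of one usage event, using Decimal to avoid float rounding in billing sums."""
    price = PRICE_PER_1K[event["model"]]
    return (
        Decimal(event["input_tokens"]) * price["input"]
        + Decimal(event["output_tokens"]) * price["output"]
    ) / Decimal(1000)


def attribute_costs(events) -> dict:
    """Sum event costs per team; a stream consumer would do this per batch."""
    totals = defaultdict(Decimal)
    for event in events:
        totals[event["team"]] += event_cost(event)
    return dict(totals)


events = [
    {"team": "payments", "model": "model-large", "input_tokens": 2000, "output_tokens": 1000},
    {"team": "payments", "model": "model-small", "input_tokens": 4000, "output_tokens": 0},
    {"team": "search", "model": "model-small", "input_tokens": 10000, "output_tokens": 2000},
]
totals = attribute_costs(events)
```

Keeping the price list as data rather than code lets the platform team update prices without redeploying the consumer, which is one plausible design choice among several.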

GCP

| Component | GCP Service / Tool |
| --- | --- |
| AI API Catalogue | Apigee Developer Portal; or Backstage on GKE |
| AI Gateway / Proxy | Apigee API Management |
| Guardrail Middleware | Cloud DLP + Cloud Endpoints |
| Rate Limiting | Apigee quota policies |
| Cost Tracker | Cloud Pub/Sub + Cloud Functions + BigQuery |
| Audit Logger | Cloud Audit Logs + Cloud Storage Bucket Lock |
| Usage Dashboard | Looker + BigQuery |
| Credential Manager | Secret Manager |
| Sandbox Playground | Custom app on Cloud Run + separate Apigee environment |
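
The Guardrail Middleware row places PII detection in the request path before a prompt reaches the model. The sketch below illustrates where that redaction step sits, using two simple regular expressions as a stand-in for a managed detector such as Cloud DLP. The patterns, labels, and the `redact` function are hypothetical and far narrower than what a production detector covers.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only; a managed detector recognises many more data types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace each detected value with a [LABEL] placeholder and count the findings."""
    counts: Dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        text, found = pattern.subn(f"[{label}]", text)
        if found:
            counts[label] = found
    return text, counts


prompt = "Refund jane.doe@example.com for card 4111 1111 1111 1111 today."
safe_prompt, findings = redact(prompt)
```

Returning the per-label counts alongside the redacted text gives the audit logger something to record without storing the sensitive values themselves.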

On-Premises / Self-Hosted

| Component | Technology |
| --- | --- |
| AI API Catalogue | Backstage (open source, self-hosted) |
| AI Gateway / Proxy | Kong Gateway (open source) + custom plugins |
| Guardrail Middleware | Microsoft Presidio (open source); NeMo Guardrails |
| Rate Limiting | Kong rate limiting plugin + Redis |
| Cost Tracker | Apache Kafka + Flink + PostgreSQL |
| Audit Logger | Splunk Enterprise + WORM storage |
| Usage Dashboard | Grafana + InfluxDB or PostgreSQL |
| Credential Manager | HashiCorp Vault |
| Sandbox Playground | Custom React app + isolated Kong environment |
| Documentation | Backstage TechDocs + GitBook |
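
The Audit Logger rows rely on write-once storage (WORM, Object Lock, Bucket Lock) to keep audit records immutable. Hash chaining is a complementary tamper-evidence technique that the pattern does not prescribe; it is sketched here only to show how an application-level log can make modification detectable. The record fields and function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {"record": record, "prev_hash": prev_hash, "hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"team": "payments", "capability": "chat-completion", "tokens": 1200})
append_entry(audit_log, {"team": "search", "capability": "embeddings", "tokens": 300})
intact = verify_chain(audit_log)

audit_log[0]["record"]["tokens"] = 1  # simulate tampering with a stored record
tampered_detected = not verify_chain(audit_log)
```

A chain like this detects modification but does not prevent deletion of the whole log, which is why it would sit alongside write-once storage and not replace it.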

16. Related Patterns

| Pattern ID | Pattern Name | Relationship | Notes |
| --- | --- | --- | --- |
| EAAPL-AGT010 | AI Agent Cost Governance | COMPLEMENTARY | Portal provides cost visibility and budget alerts; Agent Cost Governance provides per-execution controls for agentic workloads |
| EAAPL-CMP004 | Privacy-Preserving AI | COMPLEMENTARY | Portal's PII guardrail middleware operationalises privacy-preserving controls across all teams automatically |
| EAAPL-CMP007 | Data Residency for AI | COMPLEMENTARY | Portal proxy enforces data residency routing; teams see which residency rules apply to their approved capabilities |
| EAAPL-PLT007 | AI Observability Platform | PREREQUISITE | Portal's usage dashboards consume metrics from the AI observability platform |
| EAAPL-SEC001 | Zero-Trust Architecture | PREREQUISITE | Portal API keys are verified through zero-trust identity infrastructure |
| EAAPL-CMP002 | APRA CPS234 AI Security | COMPLEMENTARY | Portal's guardrails and audit logging satisfy CPS234 ¶17 detective and preventive control requirements for AI |

17. Maturity Assessment

Overall Maturity Label: Proven

| Dimension | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Current Level |
| --- | --- | --- | --- | --- | --- | --- |
| API Catalogue | No catalogue | Informal list in wiki | Searchable catalogue with AI-extended specs | Catalogue integrated with CMDB; auto-updated | Catalogue self-populating from AI vendor APIs | Level 3 |
| Access Governance | No process | Email request | Workflow with auto-approve / human review | SLA-tracked; exception management | AI-assisted request routing and risk assessment | Level 3 |
| Guardrails | No guardrails | Manual code reviews | Portal proxy applies guardrails to all calls | Guardrails configurable per team classification | Adaptive guardrails based on real-time threat intelligence | Level 3 |
| Usage Observability | No visibility | Monthly billing reports | Near-real-time per-team dashboards | Anomaly detection; proactive budget alerts | Predictive capacity and spend forecasting | Level 3 |
| Developer Experience | Direct vendor onboarding (weeks) | Basic portal (days) | Golden path templates; sandbox; <1 day onboarding | AI assistant for prompt development in portal | Conversational portal with AI-powered capability recommendation | Level 3 |

18. Revision History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2025-08-15 | EAAPL Working Group | Initial draft |
| 1.1 | 2026-06-12 | EAAPL Working Group | Added cascading failure scenario; expanded reference implementations; added regulatory considerations for ISO 42001 and NIST AI RMF alignment |