EAAPL-OBS006 · AI Cost Observability

Pattern ID: EAAPL-OBS006
Status: Proven
Complexity: Medium
Tags: cost-optimisation observability alerting llm medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12

1. Executive Summary

AI inference costs have a fundamentally different cost structure than traditional compute: they are token-based, per-request, and vary by model and usage pattern. Without purpose-built cost observability, AI spend is invisible until the monthly cloud bill arrives — at which point the overspend has already occurred and attribution to responsible teams is guesswork. Organisations routinely discover AI cost overruns of 200–500% when moving from proof-of-concept to production, driven by longer-than-expected prompts, unexpected usage volumes, or uncontrolled model selection.

This pattern defines full-stack AI cost observability: per-request cost tagging so every inference call is attributed to a team, product, feature, user tier, and model; cost allocation dashboards by every meaningful dimension; budget alerts at team, product, and organisation levels; anomaly detection on cost spikes; cost-per-outcome metrics that connect AI spend to business value; monthly attribution reports for FinOps and executive reporting; unified cloud cost and AI API cost views; and an optimisation recommendation engine that automatically identifies the top three cost reduction opportunities. The outcome is a FinOps-grade cost management capability for AI, comparable to what mature organisations have built for cloud infrastructure.

Target Audience: CIO, CTO, CFO, Head of FinOps, AI Engineering Lead
Time to Implement: 4–8 weeks

2. Problem Statement

Business Problem

AI inference costs can be 10–100x the cost of traditional API calls. A single GPT-4-class LLM call processing a long document can cost $0.10–$1.00. At scale, these costs accumulate rapidly. Yet most organisations cannot answer: Which team is responsible for 40% of our AI spend? Is our cost per resolved customer ticket increasing or decreasing? Which product feature has the worst cost-to-value ratio? Without this visibility, AI investment decisions are made on intuition rather than evidence.

Technical Problem

AI API costs flow through a small number of API keys and appear as aggregate line items in cloud billing. Request-level cost is not native to most AI service invoices; it must be calculated from token counts logged at inference time using a model-specific cost coefficient table. This calculation must happen within the application, not from the billing system, because billing data arrives 24–48 hours late and lacks the context dimensions needed for attribution.

Symptoms

Monthly cloud bill has a large "AI/ML" line item with no sub-attribution
Teams cannot self-serve their AI cost data; FinOps team manually allocates costs quarterly
A new AI feature shipped without a cost cap; usage exceeded budget by 400% in the first week
Cost optimisation conversations start with "which model should we use?" but there is no data to inform the decision
Business stakeholders cannot relate AI spend to business outcomes; AI ROI questions are unanswerable

Cost of Inaction

AI cost overruns are a leading reason for AI project cancellation or de-scoping
Without per-team cost visibility, engineering teams have no incentive to optimise AI usage
Missed opportunity to identify and implement optimisations that typically reduce AI costs by 30–60%
Inability to demonstrate AI ROI to executive sponsors undermines AI investment justification

3. Context

When to Apply

Any organisation with AI API spend exceeding $1,000/month
Organisations with multiple teams sharing AI infrastructure
Before scaling AI features to production (cost forecasting requires historical per-request data)
Prerequisite: EAAPL-OBS001 telemetry provides the per-request cost tagging data stream

When NOT to Apply

Proof-of-concept with < 30-day lifespan where cost attribution is irrelevant
Single-team, single-model, single-use-case deployments where all costs are already attributed

Prerequisites

Prerequisite	Requirement Level	Notes
EAAPL-OBS001 AI Telemetry with costUsd field	Required	Per-request cost calculation must happen at instrumentation layer
Team/product/use-case taxonomy	Required	Cost attribution requires organisational context dimensions
Budget approval workflow	Required	Budgets must be defined before alerts can fire
FinOps team or cloud cost management capability	Recommended	Owners for unified cloud + AI cost view

Industry Applicability

Industry	Applicability	Primary Driver
Financial Services	High	Cost allocation to business units; ROI accountability
Technology / SaaS	Critical	Multi-tenant cost attribution; per-customer cost of service
Healthcare	Medium	Cost per clinical decision support interaction
Retail / E-Commerce	High	Cost per recommendation, cost per customer interaction
Government	High	Departmental budget accountability; public spend justification
Education	Medium	Cost per student interaction; institutional budget management

4. Architecture Overview

The AI Cost Observability Architecture is built as a FinOps layer on top of the telemetry infrastructure from EAAPL-OBS001. It transforms per-request cost signals into aggregated cost intelligence across all business dimensions.

Per-Request Cost Tagging Architecture

Cost tagging begins at the instrumentation layer. Every LLM API call is tagged with six context dimensions: team (the engineering team responsible), product (the product or application using AI), use_case (the specific feature or workflow), user_tier (free/paid/enterprise — enabling cost-per-user-segment analysis), model (the specific model called, e.g., gpt-4o, claude-3-5-sonnet), and environment (production/staging/development). These tags are attached to every telemetry record as metadata. The per-request cost is calculated using a cost coefficient table that maps model identifiers to their current USD cost per 1K input tokens and per 1K output tokens. The calculation is: costUsd = (inputTokens / 1000 × inputCostPer1K) + (outputTokens / 1000 × outputCostPer1K). The cost coefficient table is maintained by the AI platform team and updated within 24 hours of any provider pricing change.
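As a minimal sketch of the calculation above, the following Python function applies the formula to a coefficient table. The model names and rates are hypothetical placeholders, not real provider prices, and returning None for an unknown model is an assumption chosen to match the null-cost scenario in the Error Flow table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelRates:
    """USD cost per 1K tokens for one model."""
    input_cost_per_1k: float
    output_cost_per_1k: float


# Hypothetical coefficient table; names and rates are illustrative only.
COST_TABLE: Dict[str, ModelRates] = {
    "example-large-model": ModelRates(input_cost_per_1k=0.0050, output_cost_per_1k=0.0150),
    "example-small-model": ModelRates(input_cost_per_1k=0.0005, output_cost_per_1k=0.0015),
}


def compute_cost_usd(
    model: str,
    input_tokens: int,
    output_tokens: int,
    table: Dict[str, ModelRates] = COST_TABLE,
) -> Optional[float]:
    """Apply costUsd = in/1000 * inRate + out/1000 * outRate.

    Returns None when the model has no entry, so the gap is visible downstream.
    """
    rates = table.get(model)
    if rates is None:
        return None
    return (
        input_tokens / 1000 * rates.input_cost_per_1k
        + output_tokens / 1000 * rates.output_cost_per_1k
    )
```

Under these placeholder rates, a call with 2,000 input tokens and 500 output tokens on example-large-model costs 2 × 0.005 + 0.5 × 0.015 = 0.0175 USD.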

Cost Allocation Database

Per-request cost events are streamed to a cost allocation database — a time-series store optimised for aggregation queries. The database schema supports multi-dimensional aggregation: cost by team, by product, by feature, by model, by environment, by time period. The cost allocation database is separate from the general telemetry log store because it requires different query patterns (GROUP BY aggregations over dimensions), different retention (financial records retained 7 years), and different access controls (FinOps and finance team access without requiring access to full AI telemetry).
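The multi-dimensional aggregation described above might look like the following sketch. SQLite is used here only so the example is self-contained; it stands in for whichever time-series or columnar store is chosen. The table name, column names, and sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: one row per inference call with the six tagging dimensions.
SCHEMA = """
CREATE TABLE cost_events (
    ts          TEXT NOT NULL,   -- ISO-8601 timestamp of the call
    team        TEXT NOT NULL,
    product     TEXT NOT NULL,
    use_case    TEXT NOT NULL,
    user_tier   TEXT NOT NULL,
    model       TEXT NOT NULL,
    environment TEXT NOT NULL,
    cost_usd    REAL             -- NULL when the model has no coefficient entry
)
"""

DIMENSIONS = {"team", "product", "use_case", "user_tier", "model", "environment"}


def cost_by(conn: sqlite3.Connection, dimension: str):
    """Return (dimension value, total cost) pairs, highest spend first."""
    # The column name is interpolated into SQL, so it is checked against a whitelist.
    if dimension not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    query = (
        f"SELECT {dimension}, SUM(cost_usd) AS total "
        f"FROM cost_events GROUP BY {dimension} ORDER BY total DESC"
    )
    return conn.execute(query).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with three illustrative cost events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        ("2026-06-01T10:00:00Z", "search", "docs", "qa", "paid", "example-small-model", "production", 0.50),
        ("2026-06-01T10:05:00Z", "search", "docs", "qa", "free", "example-small-model", "production", 0.25),
        ("2026-06-01T10:07:00Z", "support", "helpdesk", "triage", "enterprise", "example-large-model", "production", 1.00),
    ]
    conn.executemany("INSERT INTO cost_events VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn
```

Calling cost_by(demo_connection(), "team") groups the sample events by team; the same function serves any of the six dimensions.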

Budget Alert Engine

Budgets are defined at three levels:

Team budgets: each team has a monthly AI spend budget. Alerts fire at 80% consumption (warning) and at 100% (breach, which pages the engineering lead).
Product budgets: each product or major feature has a daily AI spend cap. If the daily cap is exceeded, an alert fires and optionally triggers automatic rate limiting on that feature.
Organisation budget: a weekly organisation-wide AI spend cap. If the weekly cap is reached, an escalation alert fires to the CTO and FinOps.

Budget configurations are stored in a budget registry with an approval workflow; changes require FinOps sign-off.
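The threshold logic for the three budget levels can be sketched as follows. The Budget type and its field names are hypothetical, and treating "reached" and "exceeded" identically (spend at or above the limit) is a simplifying assumption. The 80% warning applies only to team budgets, since that is the only level for which the text defines one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Budget:
    scope: str        # "team", "product", or "organisation"
    name: str
    period: str       # "monthly", "daily", or "weekly"
    limit_usd: float


def evaluate_budget(budget: Budget, spend_usd: float, warn_ratio: float = 0.8) -> Optional[str]:
    """Return "breach", "warning", or None for the current period's spend."""
    if budget.limit_usd <= 0:
        raise ValueError("budget limit must be positive")
    ratio = spend_usd / budget.limit_usd
    if ratio >= 1.0:
        return "breach"
    # Only team budgets define an 80% warning in this pattern.
    if budget.scope == "team" and ratio >= warn_ratio:
        return "warning"
    return None
```

A team with a 1,000 USD monthly budget and 850 USD of spend yields "warning"; a product at 85% of its daily cap yields no alert until the cap itself is hit.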

Cost Anomaly Detection

The cost anomaly engine monitors hourly spend rates and compares them to a rolling 7-day baseline (same hour of day across the past 7 days, to account for intra-day patterns). If the current hourly spend for any team or product exceeds 5x the baseline, a P2 anomaly alert fires. If it exceeds 10x, a P1 alert fires. Anomaly detection operates on the dimension of team × product × model to isolate which combination is driving the spike, enabling faster diagnosis.
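A minimal sketch of the spike classification follows. The text does not say how the seven same-hour observations are combined into a baseline, so using their arithmetic mean is an assumption; a median would be more robust to a single earlier spike. The function name is hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def classify_spike(
    current_hourly_usd: float,
    same_hour_last_7_days: Sequence[float],
    p2_multiplier: float = 5.0,
    p1_multiplier: float = 10.0,
) -> Optional[str]:
    """Compare the current hour's spend with the same hour over the past 7 days.

    Intended to be called once per team x product x model combination.
    Returns "P1", "P2", or None.
    """
    if not same_hour_last_7_days:
        return None  # no history yet, so no baseline to compare against
    baseline = mean(same_hour_last_7_days)
    if baseline <= 0:
        return None  # avoid dividing by zero for combinations with no prior spend
    ratio = current_hourly_usd / baseline
    if ratio > p1_multiplier:
        return "P1"
    if ratio > p2_multiplier:
        return "P2"
    return None
```

New combinations with no spend history return None here; in practice they would need a separate rule, such as an absolute spend ceiling.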

Cost-Per-Outcome Metrics

Raw cost numbers are only part of the story. Cost-per-outcome metrics connect AI spend to business value. Three standard cost-per-outcome metrics are defined. Cost per successful query: total AI cost for a query session / number of sessions where the user completed their intended task (measured by task completion signal). Cost per resolved ticket: total AI cost for customer service AI / number of tickets closed without human escalation. Cost per processed document: total AI cost for document processing workflows / number of documents successfully processed. These metrics are configured per use case and tracked on the cost dashboard alongside raw spend.
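As an illustration, cost per resolved ticket could be computed as in the sketch below. The event shape (dictionaries with ticket_id and cost_usd keys) is an assumption. Note that the numerator includes the cost of tickets that were later escalated, which is what the definition above implies.

```python
from typing import Iterable, Mapping, Optional


def cost_per_resolved_ticket(
    cost_events: Iterable[Mapping[str, object]],
    resolved_ticket_ids: Iterable[str],
) -> Optional[float]:
    """Total customer service AI cost divided by tickets closed without escalation.

    Returns None when no ticket was resolved, rather than dividing by zero.
    """
    total_cost = sum(float(event["cost_usd"]) for event in cost_events)
    resolved_count = len(set(resolved_ticket_ids))  # de-duplicate ticket ids
    if resolved_count == 0:
        return None
    return total_cost / resolved_count
```

The same shape works for the other two metrics by swapping the outcome signal: completed sessions for cost per successful query, or successfully processed documents for cost per processed document.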

Optimisation Recommendation Engine

The recommendation engine runs weekly and identifies the top three cost reduction opportunities across the organisation. It analyses four optimisation dimensions:

Model right-sizing: for each use case, checks whether a cheaper model achieves equivalent quality, by comparing output quality metrics between the current and cheaper models on the same query distribution.
Caching opportunities: identifies high-repetition query patterns where semantic caching could reduce LLM calls.
Prompt optimisation: identifies use cases where the average prompt token count is significantly above the median, which suggests prompt inefficiency.
Batch processing: identifies use cases where requests could be batched and processed asynchronously with a batch-optimised model rather than in real time.

The top three recommendations, with estimated savings, are included in the weekly recommendation report.
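The final ranking step can be sketched as below, assuming each analysis has already produced candidates with an estimated monthly saving. The Recommendation type and its fields are hypothetical; how the savings estimates are derived is outside this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Recommendation:
    category: str                       # e.g. "model right-sizing", "caching"
    use_case: str
    estimated_monthly_savings_usd: float


def top_recommendations(candidates: Iterable[Recommendation], limit: int = 3) -> List[Recommendation]:
    """Return the candidates with the largest estimated savings, best first."""
    ranked = sorted(
        candidates,
        key=lambda rec: rec.estimated_monthly_savings_usd,
        reverse=True,
    )
    return ranked[:limit]
```

Ranking purely by estimated savings ignores implementation effort; a weighted score would be a reasonable extension.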

5. Architecture Diagram

flowchart TD
    subgraph Instrumentation["Instrumentation Layer"]
        A[LLM API Call]
        B[Cost Calculator]
        C[Cost Event with Tags]
    end
    subgraph Storage["Cost Storage"]
        D[(Cost Allocation DB)]
        E[Budget Registry]
    end
    subgraph Intelligence["Cost Intelligence"]
        F[Budget Alert Engine]
        G[Anomaly Detector]
        H[Optimisation Engine]
    end
    A --> B
    B -->|tokens x coefficient| C
    C --> D
    D --> E
    D --> F
    D --> G
    D --> H
    F --> I[Alerts + Rate Limits]
    G --> I
    H --> J[FinOps Dashboard]
    D --> J
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#fef9c3,stroke:#eab308
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#f0fdf4,stroke:#22c55e
    style I fill:#fee2e2,stroke:#ef4444
    style J fill:#d1fae5,stroke:#10b981

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Cost Calculator	SDK Library	Compute requestCostUsd from token counts × cost coefficient table	Built into AI Client Wrapper (EAAPL-OBS001); cost table in config or secrets	Critical
Cost Coefficient Table	Configuration	Maps modelId to USD cost per 1K input + output tokens; kept current	JSON config file; DynamoDB table; managed centrally by AI platform team	Critical
Cost Allocation Database	Storage	Time-series cost events with full dimension tagging; optimised for aggregation	ClickHouse, BigQuery, Redshift, TimescaleDB	Critical
Budget Registry	Storage	Budget definitions with team/product/org hierarchy; approval workflow	PostgreSQL; Airtable; custom service	High
Budget Alert Engine	Stream Processor	Monitor budget consumption; fire alerts at thresholds	Custom service querying cost DB hourly; Grafana alerting	Critical
Cost Anomaly Engine	Stream Processor	Detect anomalous spend rates vs baseline	Streaming job on cost events; hourly batch comparison	High
Cost Attribution Dashboard	UI	Self-serve cost visibility by every dimension	Grafana, Metabase, Looker, Tableau, custom React	High
Cost-Per-Outcome Calculator	Batch Job	Join cost events with outcome signals; compute cost efficiency metrics	Custom aggregation; Airflow/Prefect scheduled	Medium
Optimisation Recommendation Engine	Batch Job	Weekly analysis of top 3 cost reduction opportunities	Custom Python analysis; ML-based right-sizing evaluation	Medium
FinOps Integration	Integration	Unified view of cloud + AI costs	AWS Cost and Usage Report + custom join; Azure Cost Management; GCP Billing Export	High
Monthly Attribution Report	Reporting	Automated cost attribution report for business unit owners	Scheduled query export + email; Looker scheduled report	Medium

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	Application Code	Sets context tags: team, product, use_case, user_tier, model, environment on AI client	Tagged AI client context
2	AI Client Wrapper	Makes LLM API call; receives response with usage (inputTokens, outputTokens)	Token counts
3	Cost Calculator	Looks up model cost coefficients; computes costUsd	costUsd field added to telemetry record
4	OTel Collector	Receives cost event with full tags; batches; forwards to cost allocation DB	Cost event in cost DB
5	Budget Consumption Monitor	Aggregates current spend by team/product/org; compares to budgets	Budget consumption percentages
6	Alert Engine	Evaluates thresholds; fires alerts where breached	Alerts to PagerDuty/Slack/email
7	Cost Anomaly Engine	Computes hourly rate vs baseline; evaluates spike thresholds	Anomaly alerts on spikes
8	Attribution Dashboard	Aggregates cost by dimension for self-serve reporting	Real-time cost dashboards
9	Optimisation Engine (weekly)	Analyses cost patterns; generates top 3 recommendations	Recommendation report
10	Monthly Report	Generates formal attribution report; distributes to team leads and FinOps	PDF/CSV report with YoY and MoM trends
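Steps 1 to 3 of the primary flow might be wired together as in the sketch below, which uses Python's contextvars so that tags set by application code are visible to the client wrapper without being passed through every call. The function names and the event field layout are assumptions; the actual wrapper interface is defined in EAAPL-OBS001.

```python
import contextvars
from contextlib import contextmanager
from typing import Dict, Optional

REQUIRED_TAGS = ("team", "product", "use_case", "user_tier", "model", "environment")

# Holds the tags for the current logical request; None means nothing was set.
_current_tags: contextvars.ContextVar = contextvars.ContextVar("cost_tags", default=None)


@contextmanager
def cost_tags(**tags: str):
    """Step 1: application code declares the six context dimensions."""
    missing = [tag for tag in REQUIRED_TAGS if tag not in tags]
    if missing:
        raise ValueError(f"missing cost tags: {missing}")
    token = _current_tags.set(dict(tags))
    try:
        yield
    finally:
        _current_tags.reset(token)  # restore whatever was set before


def build_cost_event(input_tokens: int, output_tokens: int, cost_usd: Optional[float]) -> Dict[str, object]:
    """Steps 2 and 3: combine token usage and computed cost with the active tags."""
    tags = _current_tags.get()
    if tags is None:
        raise RuntimeError("no cost tags set for this call")
    return {
        **tags,
        "inputTokens": input_tokens,
        "outputTokens": output_tokens,
        "costUsd": cost_usd,
    }
```

Raising on missing tags is one possible policy; a production wrapper might instead emit the event with an explicit "untagged" marker so that no cost goes unrecorded.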

Error Flow

Error Scenario	Detection	Action	Recovery
Cost coefficient table missing entry for new model	costUsd = null in cost events	Alert AI platform team; estimate cost from similar model pending update	Add new model within 24 hours; backfill null cost estimates
Cost allocation DB ingestion lag	Ingestion lag metric > 30 minutes	Alert platform engineering; budget alerts may be delayed	Restore ingestion; accept delayed alerts during lag window
Anomaly detection false positive (expected traffic spike)	Alert volume spike; engineering investigation finds legitimate traffic increase	Suppress alert; adjust baseline with annotated known event	Annotate event in calendar; tuned baseline excludes known events
Budget overrun not caught by alert	Post-hoc FinOps review finds overspend	Investigate the root cause of the alert configuration failure; fix alert rules	Remediate; refund or recharge overspend to team budget
Cloud billing API unavailable	Unified view shows AI API cost only	Alert FinOps; display AI cost only with note	Restore when billing API recovers
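The first error scenario (a new model with no coefficient entry) could be handled as in the following sketch, which estimates cost from a designated similar model and flags the result so it can be backfilled later. The similar_model mapping and the returned is_estimate flag are assumptions, not part of the pattern's defined schema.

```python
from typing import Dict, Optional, Tuple

Rates = Tuple[float, float]  # (input USD per 1K tokens, output USD per 1K tokens)


def estimate_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    table: Dict[str, Rates],
    similar_model: Dict[str, str],
) -> Tuple[Optional[float], bool]:
    """Return (cost_usd, is_estimate).

    Falls back to a similar model's rates when the model is missing from the
    table; returns (None, False) when no fallback is configured either.
    """
    rates = table.get(model)
    is_estimate = False
    if rates is None:
        proxy = similar_model.get(model)
        rates = table.get(proxy) if proxy is not None else None
        is_estimate = rates is not None
    if rates is None:
        return None, False
    input_rate, output_rate = rates
    cost = input_tokens / 1000 * input_rate + output_tokens / 1000 * output_rate
    return cost, is_estimate
```

Events stored with is_estimate set can then be recomputed once the real coefficients are added to the table.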

8. Security Considerations

Authentication: Cost allocation database access requires service account authentication for ingestion. Dashboard access requires SSO with RBAC. API keys for cloud billing APIs stored in secrets manager.

Authorisation: Team-level cost data accessible to that team's leads and members. Cross-team cost visibility requires FinOps or VP-level role. Organisation-level cost data restricted to CTO, CFO, FinOps, and AI platform team.

Secrets Management: Model API keys are tagged with cost centres and are never stored in the cost DB. Cloud provider billing API credentials and cost DB connection strings are held in a secrets manager.

Data Classification: Cost data is classified as Internal. Aggregated cost-per-team data shared with finance is Confidential. Cost data combined with user behaviour data is Confidential.

Encryption: Cost allocation database encrypted at rest (AES-256) and in transit (TLS 1.3). Monthly attribution reports contain financial data and are encrypted at rest and transmitted over secure channels.

Auditability: All budget definition changes are audited with requester, approver, timestamp. Alert suppression events are logged. FinOps access to cost data is logged.

OWASP LLM Top 10 Coverage

OWASP LLM Risk	Cost Observability Control	Implementation
LLM01 Prompt Injection	Injection attacks may inflate token costs	Cost anomaly detection catches injection-driven token spikes
LLM02 Insecure Output Handling	Unusually long outputs from injection increase cost	Output token anomaly detection in cost spike analysis
LLM03 Training Data Poisoning	Out of scope for cost observability	Covered by drift detection pattern
LLM04 Model Denial of Service	Abusive usage drives cost spikes	Cost anomaly detection (10x spike = P1 alert) directly detects DoS cost impact
LLM05 Supply Chain Vulnerabilities	Unexpected model switch may change cost profile	Model dimension in cost tagging detects unexpected model changes via cost shift
LLM06 Sensitive Information Disclosure	Long prompts containing PII inflate cost	Cost per user anomaly may surface individual PII-heavy usage
LLM07 Insecure Plugin Design	Tool calls add cost beyond base LLM inference	Tool call cost tracking dimension
LLM08 Excessive Agency	Agentic loops can drive runaway costs	Cost per session cap; anomaly detection on per-session cost
LLM09 Overreliance	High cost per outcome signals low AI effectiveness	Cost-per-outcome metrics directly measure overreliance consequence
LLM10 Model Theft	Bulk extraction drives unusual cost patterns	Cost per API key anomaly detection

9. Governance Considerations

Responsible AI: Cost observability is a governance prerequisite. Organisations cannot govern AI deployment responsibly if they cannot account for its cost. Cost visibility enables ROI-based decisions about which AI use cases justify continued investment.

Model Risk Management: For regulated financial institutions, AI operational costs are part of the total cost of model ownership. Cost trends feed model lifecycle reviews.

Human Approval: Budget definitions require FinOps and team lead approval. Budget increases above 50% of current budget require VP-level approval. Emergency budget overrides require CFO notification.

Policy: AI cost management policy must define: budget allocation process, alert threshold configuration ownership, response obligations when budgets are breached, cost attribution methodology, and FinOps review cadence.

Traceability: Every cost event is traceable from the business unit cost line to the specific model calls that generated it. This traceability supports internal chargeback, external billing (for AI-as-a-service products), and audit.

Governance Artefacts

Artefact	Owner	Frequency	Format
AI Cost Attribution Report	FinOps + AI Platform	Monthly	Automated PDF/CSV to team leads and finance
Budget Registry with Approval Log	FinOps	Continuous	Version-controlled configuration + approval records
Cost Anomaly Incident Log	AI Platform	Per incident	Linked to incident management system
Optimisation Recommendation Tracker	AI Engineering	Weekly	Recommendation report + implementation tracking
AI ROI Report	AI Platform + Finance	Quarterly	Cost-per-outcome metrics vs business value
Annual FinOps Review	CFO + CTO + FinOps	Annual	Strategic AI investment review

10. Operational Considerations

Monitoring: The cost observability system is itself monitored. Cost allocation DB ingestion health, budget alert engine availability, and attribution dashboard freshness are tracked. Stale cost data (> 2 hours old) triggers an alert.

Logging: The cost allocation DB maintains a write-ahead log. All budget changes and alert actions are recorded in an immutable audit log.

Incident Response: A cost incident (P1 = budget exceeded or 10x spike) triggers the AI incident management process (EAAPL-OBS004) with cost incident category. FinOps is notified in addition to engineering on-call.

Disaster Recovery: Cost allocation DB is the highest-durability component. Financial records are replicated across availability zones. Cost events can be replayed from the OTel log store if the cost DB requires restoration.

Capacity Planning: Cost allocation DB grows with inference volume. At 1M requests/day with full dimension tagging, each record is approximately 200 bytes. Daily storage is approximately 200MB; annual is approximately 70GB. Columnar compression reduces this by 80%.
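The arithmetic behind these figures can be checked with a short sketch. It uses decimal units (1 MB = 10^6 bytes) and a 365-day year, both assumptions; with them the annual figure comes out at 73 GB, which the text rounds to approximately 70 GB.

```python
from typing import Dict


def storage_estimate(
    requests_per_day: int,
    bytes_per_record: int,
    compression_saving: float = 0.80,
    days_per_year: int = 365,
) -> Dict[str, float]:
    """Estimate raw and compressed storage for per-request cost records."""
    daily_bytes = requests_per_day * bytes_per_record
    annual_bytes = daily_bytes * days_per_year
    compressed_bytes = annual_bytes * (1 - compression_saving)
    return {
        "daily_mb": daily_bytes / 1e6,
        "annual_gb": annual_bytes / 1e9,
        "annual_gb_compressed": compressed_bytes / 1e9,
    }
```

For 1M requests/day at 200 bytes each, this gives 200 MB per day, 73 GB per year uncompressed, and about 14.6 GB per year after an 80% compression saving.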

SLO Table

SLO	Target	Measurement	Alert Threshold
Cost data freshness	< 30 minutes lag from inference to cost DB	Ingestion lag metric	> 60 minutes
Budget alert delivery	< 10 minutes from threshold breach to alert	Alert delivery timestamp	> 20 minutes
Attribution dashboard data staleness	< 1 hour for dashboards	Dashboard data timestamp	> 2 hours
Monthly report delivery	By 3rd business day of following month	Report send timestamp	Missed delivery date

Disaster Recovery Table

Component	RTO	RPO	Recovery Approach
Cost Allocation DB	30 minutes	1 hour	Replicated DB; restore from OTel replay if needed
Budget Alert Engine	15 minutes	N/A (stateless rules)	Auto-restart; budgets stored in registry
Attribution Dashboard	60 minutes	N/A (read-only)	Redeploy from version control
Optimisation Engine	24 hours	N/A (weekly batch)	Re-run weekly job

11. Cost Considerations

Cost Drivers

Driver	Description	Relative Cost
Cost allocation DB storage and compute	Time-series DB with aggregation; scales with inference volume	Medium
FinOps tooling licenses	Cloud cost management tools, BI tools for dashboards	Medium
Engineering time for cost management	Budget reviews, anomaly investigation, optimisation implementation	High (human time)
Optimisation recommendation engine	Weekly batch compute; relatively small	Low

Scaling Risks: Cost allocation DB query performance can degrade with very high cardinality dimensions (per-user cost tracking). Use aggregated user_tier dimension for metrics; per-user cost tracking in logs only.

Optimisations:

Partition cost DB by date and team; most queries are recent + team-scoped
Materialise common aggregations (daily team cost, hourly anomaly baseline) as scheduled views
Use a columnar storage format (the native columnar storage in BigQuery or ClickHouse, or Parquet files in a data lake) for roughly 80% storage reduction

Indicative Cost Range

Scale	AI Spend Being Tracked/Month	Estimated Cost Observability Infrastructure/Month
Small	$1,000–$10,000	$200–$500
Medium	$10,000–$100,000	$500–$2,000
Large	$100,000–$1,000,000	$2,000–$8,000
Enterprise	$1,000,000+	$5,000–$20,000 (< 2% of managed spend)

12. Trade-Off Analysis

Approach Comparison

Approach	Pros	Cons	Best For
Per-request cost tagging with full dimension taxonomy	Precise attribution; anomaly detection; optimisation recommendations; team accountability	Requires instrumentation discipline; cost coefficient table maintenance; additional storage	Organisations with multiple teams; enterprise AI spend; mature platform engineering
Cloud billing report only	Zero instrumentation effort; authoritative financial data	24–48 hour lag; no per-request detail; no team/product attribution without cost allocation tags in billing	Single-team, single-product deployments; quick start
Vendor cost management (e.g., OpenAI usage dashboard, Anthropic console)	No infrastructure; per-request data available; some tagging	Single-vendor; no cross-model view; no business dimension attribution; no anomaly detection	Teams using a single AI provider; early-stage cost monitoring

Architectural Tensions

Tension	Description	Resolution
Granularity vs. Storage cost	Per-request records with 6 dimensions create large storage volumes	Use columnar compression; partition by date; aggregate old data
Accuracy vs. Timeliness	Per-request calculation is accurate but requires current coefficient table; billing API is authoritative but 24–48h delayed	Use per-request for operational monitoring; reconcile with billing monthly
Chargeback vs. Collaboration	Per-team cost tracking enables accountability but creates disincentive to share AI infrastructure	Use cost visibility for transparency, not punitive chargeback; shared savings from optimisations distributed back to teams
Alerting vs. Alert fatigue	Many budget thresholds create alert volume	Grade alerts: 80% = informational notification; 100% team = Slack; 100% org = page
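The monthly reconciliation mentioned under Accuracy vs. Timeliness can be sketched as a simple variance check between the per-request estimate and the provider invoice. The 5% tolerance is an assumed value for illustration; the pattern does not specify one.

```python
def reconciliation_variance(estimated_usd: float, billed_usd: float) -> float:
    """Fraction of billed spend not explained by per-request estimates.

    Positive values mean the estimates were too low (costs underestimated).
    """
    if billed_usd <= 0:
        raise ValueError("billed amount must be positive")
    return (billed_usd - estimated_usd) / billed_usd


def needs_coefficient_review(estimated_usd: float, billed_usd: float, tolerance: float = 0.05) -> bool:
    """Flag the month for review when the gap exceeds the tolerance in either direction."""
    return abs(reconciliation_variance(estimated_usd, billed_usd)) > tolerance
```

An estimate of 6,000 USD against a 10,000 USD invoice gives a variance of 0.4, which would point to a stale coefficient table or untagged traffic.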

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
Cost coefficient table out of date	Medium	Medium (underestimated costs)	Reconciliation variance between per-request estimate and billing	Update table; recompute estimates
Budget configured too high (no real control)	Medium	Medium (overspend not caught)	Annual budget review; anomaly detection still catches spikes	Tighten budgets based on actual spend history
Cost anomaly suppressed as false positive when real	Low	High (overspend undetected)	Anomaly suppression rate metric; FinOps monthly reconciliation	Never suppress without investigation; require written rationale
Per-request cost tags missing (untagged requests)	Medium	Medium (attribution gap)	Untagged request count metric; alert if > 5% untagged	Enforce tagging in AI client wrapper; block untagged requests in dev
Monthly report automation fails	Low	Low (manual workaround)	Report delivery failure alert	Manual report generation; fix automation
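The untagged-request metric from the table above could be computed as in this sketch. Treating an event as untagged when any of the six dimensions is missing or empty is an assumption about how the metric is defined.

```python
from typing import Mapping, Sequence

REQUIRED_TAGS = ("team", "product", "use_case", "user_tier", "model", "environment")


def untagged_ratio(events: Sequence[Mapping[str, object]]) -> float:
    """Share of events missing at least one required cost tag."""
    if not events:
        return 0.0
    untagged = sum(
        1 for event in events
        if any(not event.get(tag) for tag in REQUIRED_TAGS)
    )
    return untagged / len(events)


def should_alert(events: Sequence[Mapping[str, object]], threshold: float = 0.05) -> bool:
    """Alert when more than 5% of requests are untagged."""
    return untagged_ratio(events) > threshold
```

Weighting the ratio by cost rather than request count would be a sensible variant when a few untagged calls are very expensive.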

Cascading Scenarios

Scenario 1: New AI feature deployed without product budget configuration → budget alert never configured → feature goes viral → $50K AI spend in first weekend with no alert → monthly budget exhausted in 3 days. Mitigation: mandatory budget registration as deployment gate; no deployment without budget entry.
Scenario 2: Cost coefficient table not updated after model pricing change → actual costs underestimated by 40% → team thinks they're within budget → monthly bill reveals significant overrun. Mitigation: pricing change monitoring (subscribe to provider pricing changelog); automated coefficient table update.
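The deployment gate proposed as the mitigation for Scenario 1 might be as simple as the check below, run in the release pipeline. The registry shape (a dictionary keyed by product with daily_cap_usd and approved_by fields) is hypothetical.

```python
from typing import Mapping, Tuple


def deployment_gate(product: str, budget_registry: Mapping[str, Mapping[str, object]]) -> Tuple[bool, str]:
    """Allow deployment only when the product has an approved, positive daily cap."""
    entry = budget_registry.get(product)
    if entry is None:
        return False, f"no budget registered for product '{product}'"
    cap = entry.get("daily_cap_usd")
    if not isinstance(cap, (int, float)) or cap <= 0:
        return False, f"product '{product}' has no positive daily cap"
    if not entry.get("approved_by"):
        return False, f"budget for product '{product}' is not approved"
    return True, "ok"
```

A pipeline step would fail the release when the first element of the result is False and surface the message to the deploying team.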

14. Regulatory Considerations

Regulation	Clause	Requirement	AI Cost Observability Implementation
APRA CPS 230	Para 53 (Operational Risk)	Technology cost is an operational risk; AI costs qualify	Cost observability enables cost risk management and budget control
APRA CPS 230	Para 35 (Service Provider Management)	Third-party AI provider costs must be managed and monitored	Per-model cost tracking enables provider spend management
Privacy Act 1988 (AU)	APP 11 (Security)	Cost signals must not be derived in a way that leaks individual PII	Aggregated user_tier dimension; not per-user cost tracking
EU AI Act	Article 9 (Risk Management)	Cost management is part of AI operational risk management	Budget controls reduce financial risk of AI deployment
ISO/IEC 42001	Clause 8.5 (AI System Lifecycle)	Resource management for AI systems must be documented	Cost observability provides the technical foundation for resource management
NIST AI RMF	GOVERN 1.1	Organisational policies for AI management including resource allocation	Cost attribution and budget management implements resource allocation governance

15. Reference Implementations

AWS

Cost Tagging: AI Client Wrapper with context propagation; AWS tags on Bedrock requests
Cost DB: Amazon Redshift (columnar); or ClickHouse on EC2 for simpler deployments
Budget Alerts: Custom Lambda querying cost DB; CloudWatch Alarms on custom metrics
Anomaly Detection: Amazon Lookout for Metrics; custom CloudWatch anomaly detection
Attribution Dashboard: Amazon QuickSight with direct query to Redshift
FinOps Integration: AWS Cost and Usage Report + custom Athena query joining AI cost data
Monthly Report: Scheduled Lambda + SES for report delivery

Azure

Cost Tagging: AI Client Wrapper with Azure context propagation
Cost DB: Azure Synapse Analytics (Serverless SQL Pool); Azure Data Lake Storage
Budget Alerts: Azure Monitor custom metrics; Azure Cost Management Budgets for cloud costs
Anomaly Detection: Azure Anomaly Detector API on cost time series
Attribution Dashboard: Power BI Direct Query to Synapse
FinOps Integration: Azure Cost Management Export + custom join in Synapse
Monthly Report: Power BI Scheduled Report delivery

GCP

Cost Tagging: AI Client Wrapper; Vertex AI resource labels
Cost DB: BigQuery (native columnar, pay-per-query; ideal for cost data)
Budget Alerts: BigQuery scheduled queries + Cloud Monitoring custom metrics
Anomaly Detection: BigQuery ML anomaly detection on cost time series
Attribution Dashboard: Looker with BigQuery
FinOps Integration: GCP Billing Export to BigQuery + join with AI cost data
Monthly Report: Looker Scheduled Delivery

On-Premises

Cost Tagging: AI Client Wrapper
Cost DB: ClickHouse (excellent columnar performance; open source)
Budget Alerts: Custom Python service polling ClickHouse; Alertmanager for routing
Anomaly Detection: Custom Prophet or statsmodels implementation
Attribution Dashboard: Grafana with ClickHouse data source
FinOps Integration: Manual reconciliation with cloud billing data; custom ETL
Monthly Report: Scheduled SQL query + CSV export via email

16. Related Patterns

Pattern ID	Pattern Name	Relationship	Notes
EAAPL-OBS001	AI Telemetry Architecture	Foundation	costUsd field and dimension tags defined in OBS001; this pattern builds the FinOps layer on top
EAAPL-OBS004	AI Incident Management	Depends On	Cost incidents (budget breach, anomaly) trigger incident management process
EAAPL-OBS002	Prompt Monitoring	Sibling	Prompt token length anomalies detected in OBS002 are a cost driver; shared alerting

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Adoption Breadth	3	Mature in FinOps-sophisticated organisations; still largely absent in mid-market
Tooling Ecosystem	4	ClickHouse, BigQuery, Grafana, Looker are mature; AI-specific FinOps tools emerging
Operational Runbook Coverage	4	Budget management and cost anomaly runbooks well-established in mature orgs
Regulatory Evidence	3	Not directly mandated; implicit in operational risk management frameworks
Cost Predictability	5	The pattern's primary value: makes AI costs predictable
Team Skill Availability	4	FinOps and data engineering skills broadly available

18. Revision History

Version	Date	Author	Changes
1.0.0	2026-06-12	EAAPL Working Group	Initial publication
