EAAPL-OBS006 · AI Cost Observability
Pattern ID: EAAPL-OBS006
Status: Proven
Complexity: Medium
Tags: cost-optimisation observability alerting llm medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12
1. Executive Summary
AI inference has a fundamentally different cost structure from traditional compute: costs are token-based, incurred per request, and vary by model and usage pattern. Without purpose-built cost observability, AI spend is invisible until the monthly cloud bill arrives, at which point the overspend has already occurred and attribution to responsible teams is guesswork. Organisations routinely discover AI cost overruns of 200–500% when moving from proof-of-concept to production, driven by longer-than-expected prompts, unexpected usage volumes, or uncontrolled model selection.
This pattern defines full-stack AI cost observability: per-request cost tagging so every inference call is attributed to a team, product, feature, user tier, and model; cost allocation dashboards by every meaningful dimension; budget alerts at team, product, and organisation levels; anomaly detection on cost spikes; cost-per-outcome metrics that connect AI spend to business value; monthly attribution reports for FinOps and executive reporting; unified cloud cost and AI API cost views; and an optimisation recommendation engine that automatically identifies the top three cost reduction opportunities. The outcome is a FinOps-grade cost management capability for AI, comparable to what mature organisations have built for cloud infrastructure.
Target Audience: CIO, CTO, CFO, Head of FinOps, AI Engineering Lead
Time to Implement: 4–8 weeks
2. Problem Statement
Business Problem
AI inference costs can be 10–100x the cost of traditional API calls. A single GPT-4-class LLM call processing a long document can cost $0.10–$1.00. At scale, these costs accumulate rapidly. Yet most organisations cannot answer: Which team is responsible for 40% of our AI spend? Is our cost per resolved customer ticket increasing or decreasing? Which product feature has the worst cost-to-value ratio? Without this visibility, AI investment decisions are made on intuition rather than evidence.
Technical Problem
AI API costs flow through a small number of API keys and appear as aggregate line items in cloud billing. Request-level cost is not native to most AI service invoices; it must be calculated from token counts logged at inference time using a model-specific cost coefficient table. This calculation must happen within the application, not from the billing system, because billing data arrives 24–48 hours late and lacks the context dimensions needed for attribution.
Symptoms
- Monthly cloud bill has a large "AI/ML" line item with no sub-attribution
- Teams cannot self-serve their AI cost data; FinOps team manually allocates costs quarterly
- A new AI feature shipped without a cost cap; usage exceeded budget by 400% in the first week
- Cost optimisation conversations start with "which model should we use?" but there is no data to inform the decision
- Business stakeholders cannot relate AI spend to business outcomes; AI ROI questions are unanswerable
Cost of Inaction
- AI cost overruns are a leading reason for AI project cancellation or de-scoping
- Without per-team cost visibility, engineering teams have no incentive to optimise AI usage
- Missed opportunity to identify and implement optimisations that typically reduce AI costs by 30–60%
- Inability to demonstrate AI ROI to executive sponsors undermines AI investment justification
3. Context
When to Apply
- Any organisation with AI API spend exceeding $1,000/month
- Organisations with multiple teams sharing AI infrastructure
- Before scaling AI features to production (cost forecasting requires historical per-request data)
- Prerequisite: EAAPL-OBS001 telemetry provides the per-request cost tagging data stream
When NOT to Apply
- Proof-of-concept with < 30-day lifespan where cost attribution is irrelevant
- Single-team, single-model, single-use-case deployments where all costs are already attributed
Prerequisites
| Prerequisite | Required | Notes |
|---|---|---|
| EAAPL-OBS001 AI Telemetry with costUsd field | Required | Per-request cost calculation must happen at instrumentation layer |
| Team/product/use-case taxonomy | Required | Cost attribution requires organisational context dimensions |
| Budget approval workflow | Required | Budgets must be defined before alerts can fire |
| FinOps team or cloud cost management capability | Recommended | Owners for unified cloud + AI cost view |
Industry Applicability
| Industry | Applicability | Primary Driver |
|---|---|---|
| Financial Services | High | Cost allocation to business units; ROI accountability |
| Technology / SaaS | Critical | Multi-tenant cost attribution; per-customer cost of service |
| Healthcare | Medium | Cost per clinical decision support interaction |
| Retail / E-Commerce | High | Cost per recommendation, cost per customer interaction |
| Government | High | Departmental budget accountability; public spend justification |
| Education | Medium | Cost per student interaction; institutional budget management |
4. Architecture Overview
The AI Cost Observability Architecture is built as a FinOps layer on top of the telemetry infrastructure from EAAPL-OBS001. It transforms per-request cost signals into aggregated cost intelligence across all business dimensions.
Per-Request Cost Tagging Architecture
Cost tagging begins at the instrumentation layer. Every LLM API call is tagged with six context dimensions: team (the engineering team responsible), product (the product or application using AI), use_case (the specific feature or workflow), user_tier (free/paid/enterprise — enabling cost-per-user-segment analysis), model (the specific model called, e.g., gpt-4o, claude-3-5-sonnet), and environment (production/staging/development). These tags are attached to every telemetry record as metadata. The per-request cost is calculated using a cost coefficient table that maps model identifiers to their current USD cost per 1K input tokens and per 1K output tokens. The calculation is: costUsd = (inputTokens / 1000 × inputCostPer1K) + (outputTokens / 1000 × outputCostPer1K). The cost coefficient table is maintained by the AI platform team and updated within 24 hours of any provider pricing change.
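The cost formula above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `ModelRate`, `COST_TABLE`, `compute_cost_usd`, and the model names and rates are hypothetical placeholders and do not reflect real provider pricing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRate:
    input_cost_per_1k: float   # USD per 1K input tokens
    output_cost_per_1k: float  # USD per 1K output tokens


# Hypothetical coefficient table; model names and rates are illustrative only.
COST_TABLE = {
    "example-large": ModelRate(input_cost_per_1k=0.005, output_cost_per_1k=0.015),
    "example-small": ModelRate(input_cost_per_1k=0.0005, output_cost_per_1k=0.0015),
}


def compute_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Apply costUsd = in/1000 * inputCostPer1K + out/1000 * outputCostPer1K."""
    rate = COST_TABLE.get(model)
    if rate is None:
        # A missing table entry yields a null cost rather than a guess,
        # so the gap is visible downstream.
        return None
    return (input_tokens / 1000 * rate.input_cost_per_1k
            + output_tokens / 1000 * rate.output_cost_per_1k)
```

Returning `None` for an unknown model, rather than zero, keeps a missing coefficient distinguishable from a genuinely free request.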
Cost Allocation Database
Per-request cost events are streamed to a cost allocation database — a time-series store optimised for aggregation queries. The database schema supports multi-dimensional aggregation: cost by team, by product, by feature, by model, by environment, by time period. The cost allocation database is separate from the general telemetry log store because it requires different query patterns (GROUP BY aggregations over dimensions), different retention (financial records retained 7 years), and different access controls (FinOps and finance team access without requiring access to full AI telemetry).
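The multi-dimensional aggregation described above can be illustrated with a small sketch. It uses an in-memory SQLite table purely as a stand-in for the cost allocation database; the table name `cost_events`, the column names, the helper functions, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# Dimension columns follow the six tags described in this pattern.
DIMENSIONS = ("team", "product", "use_case", "user_tier", "model", "environment")


def make_cost_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the cost allocation database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cost_events ("
        "ts TEXT, team TEXT, product TEXT, use_case TEXT, "
        "user_tier TEXT, model TEXT, environment TEXT, cost_usd REAL)"
    )
    return conn


def cost_by(conn: sqlite3.Connection, dimension: str) -> dict:
    """Total cost grouped by one dimension (a GROUP BY aggregation)."""
    if dimension not in DIMENSIONS:
        # Whitelist the column name because it is interpolated into the SQL.
        raise ValueError(f"unknown dimension: {dimension}")
    cur = conn.execute(
        f"SELECT {dimension}, SUM(cost_usd) FROM cost_events GROUP BY {dimension}"
    )
    return dict(cur.fetchall())


# Illustrative rows only.
conn = make_cost_db()
conn.executemany(
    "INSERT INTO cost_events VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("2026-06-01T10:00", "search", "helpdesk", "summarise", "paid", "model-a", "production", 0.25),
        ("2026-06-01T10:05", "search", "helpdesk", "summarise", "free", "model-b", "production", 0.50),
        ("2026-06-01T10:10", "support", "chatbot", "answer", "paid", "model-a", "production", 1.00),
        ("2026-06-01T10:15", "support", "chatbot", "answer", "enterprise", "model-a", "production", 0.75),
    ],
)
```

With these rows, `cost_by(conn, "team")` groups spend per team and `cost_by(conn, "model")` per model; a production store would typically add time-bucket filters to the same query shape.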
Budget Alert Engine
Budgets are defined at three levels. Team budgets: each team has a monthly AI spend budget. Alerts fire at 80% consumption (warning) and 100% (breach — page engineering lead). Product budgets: each product or major feature has a daily AI spend cap. If the daily cap is exceeded, an alert fires and optionally triggers automatic rate limiting on that feature. Organisation budget: a weekly organisation-wide AI spend cap. If the weekly cap is reached, an escalation alert fires to CTO and FinOps. Budget configurations are stored in a budget registry with approval workflow — changes require FinOps sign-off.
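The team-level thresholds described above (warning at 80%, breach at 100%) could be evaluated along these lines. The function name and return labels are hypothetical; routing to pagers or chat channels is deliberately left out.

```python
from typing import Optional


def budget_alert(spend_usd: float, budget_usd: float) -> Optional[str]:
    """Classify budget consumption: 'warning' at 80%, 'breach' at 100%."""
    if budget_usd <= 0:
        raise ValueError("budget must be positive")
    consumed = spend_usd / budget_usd
    if consumed >= 1.0:
        return "breach"
    if consumed >= 0.8:
        return "warning"
    return None  # below the warning threshold: no alert
```

The same check can be reused for product (daily) and organisation (weekly) budgets by passing the spend and cap for the relevant window.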
Cost Anomaly Detection
The cost anomaly engine monitors hourly spend rates and compares them to a rolling 7-day baseline (same hour of day across the past 7 days, to account for intra-day patterns). If the current hourly spend for any team or product exceeds 5x the baseline, a P2 anomaly alert fires. If it exceeds 10x, a P1 alert fires. Anomaly detection operates on the dimension of team × product × model to isolate which combination is driving the spike, enabling faster diagnosis.
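A minimal sketch of the spike check follows. It assumes the baseline is the arithmetic mean of the same hour across the past seven days; the text above does not specify mean versus median, so treat that choice, and the function name `classify_spike`, as assumptions.

```python
from statistics import mean
from typing import Optional, Sequence


def classify_spike(current_hourly_usd: float,
                   same_hour_history_usd: Sequence[float]) -> Optional[str]:
    """Compare the current hour's spend to a same-hour-of-day baseline.

    same_hour_history_usd holds spend for this hour of day on each of
    the previous 7 days, for one team x product x model combination.
    """
    if not same_hour_history_usd:
        return None  # no history yet, so no ratio can be computed
    baseline = mean(same_hour_history_usd)  # assumption: mean, not median
    if baseline <= 0:
        return None  # assumption: a zero baseline is handled by a separate rule
    ratio = current_hourly_usd / baseline
    if ratio > 10:
        return "P1"
    if ratio > 5:
        return "P2"
    return None
```

Running the check per team × product × model combination keeps the alert specific about which combination is driving the spike.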
Cost-Per-Outcome Metrics
Raw cost numbers are only part of the story. Cost-per-outcome metrics connect AI spend to business value. Three standard cost-per-outcome metrics are defined. Cost per successful query: total AI cost for a query session / number of sessions where the user completed their intended task (measured by task completion signal). Cost per resolved ticket: total AI cost for customer service AI / number of tickets closed without human escalation. Cost per processed document: total AI cost for document processing workflows / number of documents successfully processed. These metrics are configured per use case and tracked on the cost dashboard alongside raw spend.
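As one illustration of these metrics, cost per resolved ticket can be computed by joining cost events with outcome signals. The data shapes below (a list of `(ticket_id, cost_usd)` pairs and a mapping of ticket ID to a resolved-without-escalation flag) are assumptions chosen for the example.

```python
from typing import Dict, Iterable, Optional, Tuple


def cost_per_resolved_ticket(
    cost_events: Iterable[Tuple[str, float]],
    resolved_without_escalation: Dict[str, bool],
) -> Optional[float]:
    """Total AI cost for the workflow divided by tickets closed without escalation."""
    # The numerator includes cost spent on tickets that were later escalated.
    total_cost = sum(cost for _ticket_id, cost in cost_events)
    resolved = sum(1 for ok in resolved_without_escalation.values() if ok)
    if resolved == 0:
        return None  # metric is undefined when nothing was resolved
    return total_cost / resolved
```

Counting the cost of escalated tickets in the numerator is intentional: spend on failed attempts still has to be carried by the successful outcomes.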
Optimisation Recommendation Engine
The recommendation engine runs weekly and identifies the top three cost reduction opportunities across the organisation. It analyses four optimisation dimensions. Model right-sizing: for each use case, identifies whether a cheaper model achieves equivalent quality (by comparing output quality metrics between current and cheaper models on the same query distribution). Caching opportunities: identifies high-repetition query patterns where semantic caching could reduce LLM calls. Prompt optimisation: identifies use cases where average prompt token count is significantly above the median (suggesting prompt inefficiency). Batch processing: identifies use cases where requests could be batched and processed asynchronously with a batch-optimised model rather than real-time. The top three recommendations are included in the weekly cost attribution report with estimated savings.
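The prompt optimisation check could be sketched as below. The text does not define "significantly above the median", so the `factor` threshold of 2.0 is an assumption, as are the function name and input shape.

```python
from statistics import mean, median
from typing import Dict, List, Sequence


def flag_prompt_heavy(prompt_tokens_by_use_case: Dict[str, Sequence[int]],
                      factor: float = 2.0) -> List[str]:
    """Return use cases whose average prompt size exceeds factor x the median.

    The median is taken over the per-use-case averages.
    """
    averages = {
        use_case: mean(tokens)
        for use_case, tokens in prompt_tokens_by_use_case.items()
        if tokens  # skip use cases with no recorded requests
    }
    if not averages:
        return []
    reference = median(averages.values())
    return sorted(uc for uc, avg in averages.items() if avg > factor * reference)
```

A flagged use case is only a candidate for review; long prompts can be legitimate for workloads such as document processing.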
5. Architecture Diagram
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| Cost Calculator | SDK Library | Compute requestCostUsd from token counts × cost coefficient table | Built into AI Client Wrapper (EAAPL-OBS001); cost table in config or secrets | Critical |
| Cost Coefficient Table | Configuration | Maps modelId to USD cost per 1K input + output tokens; kept current | JSON config file; DynamoDB table; managed centrally by AI platform team | Critical |
| Cost Allocation Database | Storage | Time-series cost events with full dimension tagging; optimised for aggregation | ClickHouse, BigQuery, Redshift, TimescaleDB | Critical |
| Budget Registry | Storage | Budget definitions with team/product/org hierarchy; approval workflow | PostgreSQL; Airtable; custom service | High |
| Budget Alert Engine | Stream Processor | Monitor budget consumption; fire alerts at thresholds | Custom service querying cost DB hourly; Grafana alerting | Critical |
| Cost Anomaly Engine | Stream Processor | Detect anomalous spend rates vs baseline | Streaming job on cost events; hourly batch comparison | High |
| Cost Attribution Dashboard | UI | Self-serve cost visibility by every dimension | Grafana, Metabase, Looker, Tableau, custom React | High |
| Cost-Per-Outcome Calculator | Batch Job | Join cost events with outcome signals; compute cost efficiency metrics | Custom aggregation; Airflow/Prefect scheduled | Medium |
| Optimisation Recommendation Engine | Batch Job | Weekly analysis of top 3 cost reduction opportunities | Custom Python analysis; ML-based right-sizing evaluation | Medium |
| FinOps Integration | Integration | Unified view of cloud + AI costs | AWS Cost and Usage Report + custom join; Azure Cost Management; GCP Billing Export | High |
| Monthly Attribution Report | Reporting | Automated cost attribution report for business unit owners | Scheduled query export + email; Looker scheduled report | Medium |
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Application Code | Sets context tags: team, product, use_case, user_tier, model, environment on AI client | Tagged AI client context |
| 2 | AI Client Wrapper | Makes LLM API call; receives response with usage (inputTokens, outputTokens) | Token counts |
| 3 | Cost Calculator | Looks up model cost coefficients; computes costUsd | costUsd field added to telemetry record |
| 4 | OTel Collector | Receives cost event with full tags; batches; forwards to cost allocation DB | Cost event in cost DB |
| 5 | Budget Consumption Monitor | Aggregates current spend by team/product/org; compares to budgets | Budget consumption percentages |
| 6 | Alert Engine | Evaluates thresholds; fires alerts where breached | Alerts to PagerDuty/Slack/email |
| 7 | Cost Anomaly Engine | Computes hourly rate vs baseline; evaluates spike thresholds | Anomaly alerts on spikes |
| 8 | Attribution Dashboard | Aggregates cost by dimension for self-serve reporting | Real-time cost dashboards |
| 9 | Optimisation Engine (weekly) | Analyses cost patterns; generates top 3 recommendations | Recommendation report |
| 10 | Monthly Report | Generates formal attribution report; distributes to team leads and FinOps | PDF/CSV report with YoY and MoM trends |
Error Flow
| Error Scenario | Detection | Action | Recovery |
|---|---|---|---|
| Cost coefficient table missing entry for new model | costUsd = null in cost events | Alert AI platform team; estimate cost from similar model pending update | Add new model within 24 hours; backfill null cost estimates |
| Cost allocation DB ingestion lag | Ingestion lag metric > 30 minutes | Alert platform engineering; budget alerts may be delayed | Restore ingestion; accept delayed alerts during lag window |
| Anomaly detection false positive (expected traffic spike) | Alert volume spike; engineering investigation finds legitimate traffic increase | Suppress alert; adjust baseline with annotated known event | Annotate event in calendar; tuned baseline excludes known events |
| Budget overrun not caught by alert | Post-hoc FinOps review finds overspend | Root-cause the alert configuration failure; fix alert rules | Remediate; refund or recharge overspend to team budget |
| Cloud billing API unavailable | Unified view shows AI API cost only | Alert FinOps; display AI cost only with note | Restore when billing API recovers |
8. Security Considerations
Authentication: Ingestion into the cost allocation database requires service account authentication. Dashboard access requires SSO with RBAC. API keys for cloud billing APIs are stored in a secrets manager.
Authorisation: Team-level cost data accessible to that team's leads and members. Cross-team cost visibility requires FinOps or VP-level role. Organisation-level cost data restricted to CTO, CFO, FinOps, and AI platform team.
Secrets Management: Model API keys are tagged with cost centres and are never stored in the cost DB. Cloud provider billing API credentials and cost DB connection strings are held in a secrets manager.
Data Classification: Cost data is classified as Internal. Aggregated cost-per-team data shared with finance is Confidential. Cost data combined with user behaviour data is Confidential.
Encryption: Cost allocation database encrypted at rest (AES-256) and in transit (TLS 1.3). Monthly attribution reports contain financial data and are encrypted at rest and transmitted over secure channels.
Auditability: All budget definition changes are audited with requester, approver, timestamp. Alert suppression events are logged. FinOps access to cost data is logged.
OWASP LLM Top 10 Coverage
| OWASP LLM Risk | Cost Observability Control | Implementation |
|---|---|---|
| LLM01 Prompt Injection | Injection attacks may inflate token costs | Cost anomaly detection catches injection-driven token spikes |
| LLM02 Insecure Output Handling | Unusually long outputs from injection increase cost | Output token anomaly detection in cost spike analysis |
| LLM03 Training Data Poisoning | Out of scope for cost observability | Covered by drift detection pattern |
| LLM04 Model Denial of Service | Abusive usage drives cost spikes | Cost anomaly detection (10x spike = P1 alert) directly detects DoS cost impact |
| LLM05 Supply Chain Vulnerabilities | Unexpected model switch may change cost profile | Model dimension in cost tagging detects unexpected model changes via cost shift |
| LLM06 Sensitive Information Disclosure | Long prompts containing PII inflate cost | Cost per user anomaly may surface individual PII-heavy usage |
| LLM07 Insecure Plugin Design | Tool calls add cost beyond base LLM inference | Tool call cost tracking dimension |
| LLM08 Excessive Agency | Agentic loops can drive runaway costs | Cost per session cap; anomaly detection on per-session cost |
| LLM09 Overreliance | High cost per outcome signals low AI effectiveness | Cost-per-outcome metrics directly measure overreliance consequence |
| LLM10 Model Theft | Bulk extraction drives unusual cost patterns | Cost per API key anomaly detection |
9. Governance Considerations
Responsible AI: Cost observability is a governance prerequisite. Organisations cannot govern AI deployment responsibly if they cannot account for its cost. Cost visibility enables ROI-based decisions about which AI use cases justify continued investment.
Model Risk Management: For regulated financial institutions, AI operational costs are part of the total cost of model ownership. Cost trends feed model lifecycle reviews.
Human Approval: Budget definitions require FinOps and team lead approval. Budget increases above 50% of current budget require VP-level approval. Emergency budget overrides require CFO notification.
Policy: AI cost management policy must define: budget allocation process, alert threshold configuration ownership, response obligations when budgets are breached, cost attribution methodology, and FinOps review cadence.
Traceability: Every cost event is traceable from the business unit cost line to the specific model calls that generated it. This traceability supports internal chargeback, external billing (for AI-as-a-service products), and audit.
Governance Artefacts
| Artefact | Owner | Frequency | Format |
|---|---|---|---|
| AI Cost Attribution Report | FinOps + AI Platform | Monthly | Automated PDF/CSV to team leads and finance |
| Budget Registry with Approval Log | FinOps | Continuous | Version-controlled configuration + approval records |
| Cost Anomaly Incident Log | AI Platform | Per incident | Linked to incident management system |
| Optimisation Recommendation Tracker | AI Engineering | Weekly | Recommendation report + implementation tracking |
| AI ROI Report | AI Platform + Finance | Quarterly | Cost-per-outcome metrics vs business value |
| Annual FinOps Review | CFO + CTO + FinOps | Annual | Strategic AI investment review |
10. Operational Considerations
Monitoring: The cost observability system is itself monitored. Cost allocation DB ingestion health, budget alert engine availability, and attribution dashboard freshness are tracked. Stale cost data (> 2 hours old) triggers an alert.
Logging: The cost allocation DB maintains a write-ahead log. All budget changes and alert actions are recorded in an immutable audit log.
Incident Response: A cost incident (P1 = budget exceeded or 10x spike) triggers the AI incident management process (EAAPL-OBS004) with cost incident category. FinOps is notified in addition to engineering on-call.
Disaster Recovery: Cost allocation DB is the highest-durability component. Financial records are replicated across availability zones. Cost events can be replayed from the OTel log store if the cost DB requires restoration.
Capacity Planning: Cost allocation DB grows with inference volume. At 1M requests/day with full dimension tagging, each record is approximately 200 bytes. Daily storage is approximately 200MB; annual is approximately 70GB. Columnar compression reduces this by 80%.
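The sizing arithmetic above can be reproduced with a short calculation. The defaults mirror the figures quoted in the text (200 bytes per record, 80% compression saving); the function name is hypothetical and decimal units (1 MB = 10^6 bytes) are assumed.

```python
def storage_estimate(requests_per_day: int,
                     bytes_per_record: int = 200,
                     compression_saving: float = 0.80) -> dict:
    """Estimate raw and compressed storage for cost events (decimal MB/GB)."""
    daily_bytes = requests_per_day * bytes_per_record
    annual_bytes = daily_bytes * 365
    return {
        "daily_mb": daily_bytes / 1e6,
        "annual_gb": annual_bytes / 1e9,
        "annual_gb_compressed": annual_bytes * (1 - compression_saving) / 1e9,
    }
```

At 1M requests/day this gives 200 MB per day and 73 GB per year uncompressed, in line with the approximate figures above, or roughly 14.6 GB per year after compression.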
SLO Table
| SLO | Target | Measurement | Alert Threshold |
|---|---|---|---|
| Cost data freshness | < 30 minutes lag from inference to cost DB | Ingestion lag metric | > 60 minutes |
| Budget alert delivery | < 10 minutes from threshold breach to alert | Alert delivery timestamp | > 20 minutes |
| Attribution dashboard data staleness | < 1 hour for dashboards | Dashboard data timestamp | > 2 hours |
| Monthly report delivery | By 3rd business day of following month | Report send timestamp | Missed delivery date |
Disaster Recovery Table
| Component | RTO | RPO | Recovery Approach |
|---|---|---|---|
| Cost Allocation DB | 30 minutes | 1 hour | Replicated DB; restore from OTel replay if needed |
| Budget Alert Engine | 15 minutes | N/A (stateless rules) | Auto-restart; budgets stored in registry |
| Attribution Dashboard | 60 minutes | N/A (read-only) | Redeploy from version control |
| Optimisation Engine | 24 hours | N/A (weekly batch) | Re-run weekly job |
11. Cost Considerations
Cost Drivers
| Driver | Description | Relative Cost |
|---|---|---|
| Cost allocation DB storage and compute | Time-series DB with aggregation; scales with inference volume | Medium |
| FinOps tooling licenses | Cloud cost management tools, BI tools for dashboards | Medium |
| Engineering time for cost management | Budget reviews, anomaly investigation, optimisation implementation | High (human time) |
| Optimisation recommendation engine | Weekly batch compute; relatively small | Low |
Scaling Risks: Cost allocation DB query performance can degrade with very high-cardinality dimensions (for example, per-user cost tracking). Use the aggregated user_tier dimension for metrics and keep per-user cost tracking in logs only.
Optimisations:
- Partition cost DB by date and team; most queries are recent + team-scoped
- Materialise common aggregations (daily team cost, hourly anomaly baseline) as scheduled views
- Use columnar storage (native in BigQuery and ClickHouse; Parquet for files held in object storage) for roughly 80% storage reduction
Indicative Cost Range
| Scale | AI Spend Being Tracked/Month | Estimated Cost Observability Infrastructure/Month |
|---|---|---|
| Small | $1,000–$10,000 | $200–$500 |
| Medium | $10,000–$100,000 | $500–$2,000 |
| Large | $100,000–$1,000,000 | $2,000–$8,000 |
| Enterprise | $1,000,000+ | $5,000–$20,000 (< 2% of managed spend) |
12. Trade-Off Analysis
Approach Comparison
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Per-request cost tagging with full dimension taxonomy | Precise attribution; anomaly detection; optimisation recommendations; team accountability | Requires instrumentation discipline; cost coefficient table maintenance; additional storage | Organisations with multiple teams; enterprise AI spend; mature platform engineering |
| Cloud billing report only | Zero instrumentation effort; authoritative financial data | 24–48 hour lag; no per-request detail; no team/product attribution without cost allocation tags in billing | Single-team, single-product deployments; quick start |
| Vendor cost management (e.g., OpenAI usage dashboard, Anthropic console) | No infrastructure; per-request data available; some tagging | Single-vendor; no cross-model view; no business dimension attribution; no anomaly detection | Teams using a single AI provider; early-stage cost monitoring |
Architectural Tensions
| Tension | Description | Resolution |
|---|---|---|
| Granularity vs. Storage cost | Per-request records with 6 dimensions create large storage volumes | Use columnar compression; partition by date; aggregate old data |
| Accuracy vs. Timeliness | Per-request calculation is accurate but requires current coefficient table; billing API is authoritative but 24–48h delayed | Use per-request for operational monitoring; reconcile with billing monthly |
| Chargeback vs. Collaboration | Per-team cost tracking enables accountability but creates disincentive to share AI infrastructure | Use cost visibility for transparency, not punitive chargeback; shared savings from optimisations distributed back to teams |
| Alerting vs. Alert fatigue | Many budget thresholds create alert volume | Grade alerts: 80% = informational notification; 100% team = Slack; 100% org = page |
13. Failure Modes
| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Cost coefficient table out of date | Medium | Medium (costs under- or overestimated) | Reconciliation variance between per-request estimate and billing | Update table; recompute estimates |
| Budget configured too high (no real control) | Medium | Medium (overspend not caught) | Annual budget review; anomaly detection still catches spikes | Tighten budgets based on actual spend history |
| Cost anomaly suppressed as false positive when real | Low | High (overspend undetected) | Anomaly suppression rate metric; FinOps monthly reconciliation | Never suppress without investigation; require written rationale |
| Per-request cost tags missing (untagged requests) | Medium | Medium (attribution gap) | Untagged request count metric; alert if > 5% untagged | Enforce tagging in AI client wrapper; block untagged requests in dev |
| Monthly report automation fails | Low | Low (manual workaround) | Report delivery failure alert | Manual report generation; fix automation |
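The untagged-request detection in the table above (alert if more than 5% of requests are untagged) could look like the following sketch. It assumes telemetry records are dictionaries keyed by the six tag names; `untagged_rate` and `should_alert` are hypothetical helper names.

```python
from typing import Iterable, Mapping

REQUIRED_TAGS = ("team", "product", "use_case", "user_tier", "model", "environment")


def untagged_rate(records: Iterable[Mapping[str, str]]) -> float:
    """Fraction of records missing (or with an empty value for) any required tag."""
    records = list(records)
    if not records:
        return 0.0
    untagged = sum(
        1 for record in records
        if any(not record.get(tag) for tag in REQUIRED_TAGS)
    )
    return untagged / len(records)


def should_alert(records: Iterable[Mapping[str, str]], threshold: float = 0.05) -> bool:
    """True when the untagged share exceeds the threshold (5% by default)."""
    return untagged_rate(records) > threshold
```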
Cascading Scenarios
- Scenario 1: New AI feature deployed without product budget configuration → budget alert never configured → feature goes viral → $50K AI spend in first weekend with no alert → monthly budget exhausted in 3 days. Mitigation: mandatory budget registration as deployment gate; no deployment without budget entry.
- Scenario 2: Cost coefficient table not updated after model pricing change → actual costs underestimated by 40% → team thinks they're within budget → monthly bill reveals significant overrun. Mitigation: pricing change monitoring (subscribe to provider pricing changelog); automated coefficient table update.
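Scenario 2 is the kind of drift a reconciliation check between per-request estimates and the billed amount is meant to surface. The sketch below is one possible formulation; the 5% tolerance and both function names are assumptions, not values from this pattern.

```python
def reconciliation_variance(estimated_usd: float, billed_usd: float) -> float:
    """Share of the billed amount not explained by per-request estimates.

    Positive values mean the estimates were too low (underestimation).
    """
    if billed_usd <= 0:
        raise ValueError("billed amount must be positive")
    return (billed_usd - estimated_usd) / billed_usd


def needs_table_review(estimated_usd: float, billed_usd: float,
                       tolerance: float = 0.05) -> bool:
    """Flag the coefficient table for review when variance exceeds the tolerance."""
    # tolerance is a hypothetical default; tune it against observed billing noise.
    return abs(reconciliation_variance(estimated_usd, billed_usd)) > tolerance
```

For example, estimates of $6,000 against a bill of $10,000 give a variance of 0.4, matching the 40% underestimation in Scenario 2.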
14. Regulatory Considerations
| Regulation | Clause | Requirement | AI Cost Observability Implementation |
|---|---|---|---|
| APRA CPS 230 | Para 53 (Operational Risk) | Technology cost is an operational risk; AI costs qualify | Cost observability enables cost risk management and budget control |
| APRA CPS 230 | Para 35 (Service Provider Management) | Third-party AI provider costs must be managed and monitored | Per-model cost tracking enables provider spend management |
| Privacy Act 1988 (AU) | APP 11 (Security) | Cost signals must not be derived in a way that leaks individual PII | Aggregated user_tier dimension; not per-user cost tracking |
| EU AI Act | Article 9 (Risk Management) | Cost management is part of AI operational risk management | Budget controls reduce financial risk of AI deployment |
| ISO/IEC 42001 | Clause 8.5 (AI System Lifecycle) | Resource management for AI systems must be documented | Cost observability provides the technical foundation for resource management |
| NIST AI RMF | GOVERN 1.1 | Organisational policies for AI management including resource allocation | Cost attribution and budget management implements resource allocation governance |
15. Reference Implementations
AWS
- Cost Tagging: AI Client Wrapper with context propagation; AWS tags on Bedrock requests
- Cost DB: Amazon Redshift (columnar); or ClickHouse on EC2 for simpler deployments
- Budget Alerts: Custom Lambda querying cost DB; CloudWatch Alarms on custom metrics
- Anomaly Detection: Amazon Lookout for Metrics; custom CloudWatch anomaly detection
- Attribution Dashboard: Amazon QuickSight with direct query to Redshift
- FinOps Integration: AWS Cost and Usage Report + custom Athena query joining AI cost data
- Monthly Report: Scheduled Lambda + SES for report delivery
Azure
- Cost Tagging: AI Client Wrapper with Azure context propagation
- Cost DB: Azure Synapse Analytics (Serverless SQL Pool); Azure Data Lake Storage
- Budget Alerts: Azure Monitor custom metrics; Azure Cost Management Budgets for cloud costs
- Anomaly Detection: Azure Anomaly Detector API on cost time series
- Attribution Dashboard: Power BI Direct Query to Synapse
- FinOps Integration: Azure Cost Management Export + custom join in Synapse
- Monthly Report: Power BI Scheduled Report delivery
GCP
- Cost Tagging: AI Client Wrapper; Vertex AI resource labels
- Cost DB: BigQuery (native columnar, pay-per-query; ideal for cost data)
- Budget Alerts: BigQuery scheduled queries + Cloud Monitoring custom metrics
- Anomaly Detection: BigQuery ML anomaly detection on cost time series
- Attribution Dashboard: Looker with BigQuery
- FinOps Integration: GCP Billing Export to BigQuery + join with AI cost data
- Monthly Report: Looker Scheduled Delivery
On-Premises
- Cost Tagging: AI Client Wrapper
- Cost DB: ClickHouse (excellent columnar performance; open source)
- Budget Alerts: Custom Python service polling ClickHouse; Alertmanager for routing
- Anomaly Detection: Custom Prophet or statsmodels implementation
- Attribution Dashboard: Grafana with ClickHouse data source
- FinOps Integration: Manual reconciliation with cloud billing data; custom ETL
- Monthly Report: Scheduled SQL query + CSV export via email
16. Related Patterns
| Pattern ID | Pattern Name | Relationship | Notes |
|---|---|---|---|
| EAAPL-OBS001 | AI Telemetry Architecture | Foundation | costUsd field and dimension tags defined in OBS001; this pattern builds the FinOps layer on top |
| EAAPL-OBS004 | AI Incident Management | Depends On | Cost incidents (budget breach, anomaly) trigger incident management process |
| EAAPL-OBS002 | Prompt Monitoring | Sibling | Prompt token length anomalies detected in OBS002 are a cost driver; shared alerting |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Rationale |
|---|---|---|
| Adoption Breadth | 3 | Mature in FinOps-sophisticated organisations; still largely absent in mid-market |
| Tooling Ecosystem | 4 | ClickHouse, BigQuery, Grafana, Looker are mature; AI-specific FinOps tools emerging |
| Operational Runbook Coverage | 4 | Budget management and cost anomaly runbooks well-established in mature orgs |
| Regulatory Evidence | 3 | Not directly mandated; implicit in operational risk management frameworks |
| Cost Predictability | 5 | The pattern's primary value: makes AI costs predictable |
| Team Skill Availability | 4 | FinOps and data engineering skills broadly available |
18. Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2026-06-12 | EAAPL Working Group | Initial publication |