EAAPL: Enterprise AI Architecture Pattern Library

EAAPL-OBS006 · AI Cost Observability

Pattern ID: EAAPL-OBS006
Status: Proven
Complexity: Medium
Tags: cost-optimisation, observability, alerting, llm, medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12


1. Executive Summary

AI inference has a fundamentally different cost structure from traditional compute: costs are token-based, incurred per request, and vary by model and usage pattern. Without purpose-built cost observability, AI spend is invisible until the monthly cloud bill arrives, at which point the overspend has already occurred and attribution to responsible teams is guesswork. Organisations routinely discover AI cost overruns of 200–500% when moving from proof-of-concept to production, driven by longer-than-expected prompts, unexpected usage volumes, or uncontrolled model selection.

This pattern defines full-stack AI cost observability:

  • Per-request cost tagging, so every inference call is attributed to a team, product, feature, user tier, and model
  • Cost allocation dashboards by every meaningful dimension
  • Budget alerts at team, product, and organisation levels
  • Anomaly detection on cost spikes
  • Cost-per-outcome metrics that connect AI spend to business value
  • Monthly attribution reports for FinOps and executive reporting
  • Unified cloud cost and AI API cost views
  • An optimisation recommendation engine that automatically identifies the top three cost reduction opportunities

The outcome is a FinOps-grade cost management capability for AI, comparable to what mature organisations have built for cloud infrastructure.

Target Audience: CIO, CTO, CFO, Head of FinOps, AI Engineering Lead
Time to Implement: 4–8 weeks


2. Problem Statement

Business Problem

AI inference costs can be 10–100x the cost of traditional API calls. A single GPT-4-class LLM call processing a long document can cost $0.10–$1.00. At scale, these costs accumulate rapidly. Yet most organisations cannot answer: Which team is responsible for 40% of our AI spend? Is our cost per resolved customer ticket increasing or decreasing? Which product feature has the worst cost-to-value ratio? Without this visibility, AI investment decisions are made on intuition rather than evidence.

Technical Problem

AI API costs flow through a small number of API keys and appear as aggregate line items in cloud billing. Most AI service invoices do not report request-level cost; it must be calculated from the token counts logged at inference time, using a model-specific cost coefficient table. This calculation must happen in the application rather than be derived from the billing system, because billing data arrives 24–48 hours late and lacks the context dimensions needed for attribution.

Symptoms

  • Monthly cloud bill has a large "AI/ML" line item with no sub-attribution
  • Teams cannot self-serve their AI cost data; FinOps team manually allocates costs quarterly
  • A new AI feature shipped without a cost cap; usage exceeded budget by 400% in the first week
  • Cost optimisation conversations start with "which model should we use?" but there is no data to inform the decision
  • Business stakeholders cannot relate AI spend to business outcomes; AI ROI questions are unanswerable

Cost of Inaction

  • AI cost overruns are a leading reason for AI project cancellation or de-scoping
  • Without per-team cost visibility, engineering teams have no incentive to optimise AI usage
  • Missed opportunity to identify and implement optimisations that typically reduce AI costs by 30–60%
  • Inability to demonstrate AI ROI to executive sponsors undermines AI investment justification

3. Context

When to Apply

  • Any organisation with AI API spend exceeding $1,000/month
  • Organisations with multiple teams sharing AI infrastructure
  • Before scaling AI features to production (cost forecasting requires historical per-request data)
  • Prerequisite: EAAPL-OBS001 telemetry provides the per-request cost tagging data stream

When NOT to Apply

  • Proof-of-concept with < 30-day lifespan where cost attribution is irrelevant
  • Single-team, single-model, single-use-case deployments where all costs are already attributed

Prerequisites

Prerequisite | Required | Notes
EAAPL-OBS001 AI Telemetry with costUsd field | Required | Per-request cost calculation must happen at instrumentation layer
Team/product/use-case taxonomy | Required | Cost attribution requires organisational context dimensions
Budget approval workflow | Required | Budgets must be defined before alerts can fire
FinOps team or cloud cost management capability | Recommended | Owners for unified cloud + AI cost view

Industry Applicability

Industry | Applicability | Primary Driver
Financial Services | High | Cost allocation to business units; ROI accountability
Technology / SaaS | Critical | Multi-tenant cost attribution; per-customer cost of service
Healthcare | Medium | Cost per clinical decision support interaction
Retail / E-Commerce | High | Cost per recommendation, cost per customer interaction
Government | High | Departmental budget accountability; public spend justification
Education | Medium | Cost per student interaction; institutional budget management

4. Architecture Overview

The AI Cost Observability Architecture is built as a FinOps layer on top of the telemetry infrastructure from EAAPL-OBS001. It transforms per-request cost signals into aggregated cost intelligence across all business dimensions.

Per-Request Cost Tagging Architecture

Cost tagging begins at the instrumentation layer. Every LLM API call is tagged with six context dimensions: team (the engineering team responsible), product (the product or application using AI), use_case (the specific feature or workflow), user_tier (free/paid/enterprise — enabling cost-per-user-segment analysis), model (the specific model called, e.g., gpt-4o, claude-3-5-sonnet), and environment (production/staging/development). These tags are attached to every telemetry record as metadata. The per-request cost is calculated using a cost coefficient table that maps model identifiers to their current USD cost per 1K input tokens and per 1K output tokens. The calculation is: costUsd = (inputTokens / 1000 × inputCostPer1K) + (outputTokens / 1000 × outputCostPer1K). The cost coefficient table is maintained by the AI platform team and updated within 24 hours of any provider pricing change.
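The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the EAAPL-OBS001 wrapper itself: the model names, the prices in COST_TABLE, and the function and class names are hypothetical placeholders. Returning None for an unknown model mirrors the "costUsd = null" behaviour described for missing coefficient entries.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical coefficient table: USD per 1K input / output tokens.
# These prices are placeholders, not real provider pricing.
COST_TABLE = {
    "example-large": {"input_per_1k": 0.005, "output_per_1k": 0.015},
    "example-small": {"input_per_1k": 0.0005, "output_per_1k": 0.0015},
}


@dataclass(frozen=True)
class CostTags:
    """The six context dimensions attached to every cost event."""
    team: str
    product: str
    use_case: str
    user_tier: str
    model: str
    environment: str


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return costUsd, or None when the model has no coefficient entry."""
    coeff = COST_TABLE.get(model)
    if coeff is None:
        return None
    return (input_tokens / 1000 * coeff["input_per_1k"]
            + output_tokens / 1000 * coeff["output_per_1k"])


def build_cost_event(tags: CostTags, input_tokens: int, output_tokens: int) -> dict:
    """Combine the tags, the token counts, and the computed cost into one record."""
    event = asdict(tags)
    event["inputTokens"] = input_tokens
    event["outputTokens"] = output_tokens
    event["costUsd"] = request_cost_usd(tags.model, input_tokens, output_tokens)
    return event


tags = CostTags("search", "portal", "qa", "paid", "example-large", "production")
event = build_cost_event(tags, input_tokens=2000, output_tokens=500)
```

Keeping the coefficient lookup separate from event construction means a pricing change only touches the table, which is the property the 24-hour update requirement relies on.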

Cost Allocation Database

Per-request cost events are streamed to a cost allocation database — a time-series store optimised for aggregation queries. The database schema supports multi-dimensional aggregation: cost by team, by product, by feature, by model, by environment, by time period. The cost allocation database is separate from the general telemetry log store because it requires different query patterns (GROUP BY aggregations over dimensions), different retention (financial records retained 7 years), and different access controls (FinOps and finance team access without requiring access to full AI telemetry).
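To make the aggregation query pattern concrete, the sketch below uses Python's built-in sqlite3 module as a stand-in for the cost allocation database. The table name, column names, and sample rows are assumptions for illustration; a production deployment would use one of the columnar stores listed under Components.

```python
import sqlite3

# Dimensions that may be used in GROUP BY; doubles as an allow-list so the
# dimension name can be interpolated into the SQL text safely.
DIMENSIONS = ("team", "product", "use_case", "user_tier", "model", "environment")


def open_cost_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical cost_events schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cost_events (
            ts TEXT NOT NULL,
            team TEXT, product TEXT, use_case TEXT,
            user_tier TEXT, model TEXT, environment TEXT,
            input_tokens INTEGER, output_tokens INTEGER,
            cost_usd REAL
        )
        """
    )
    return conn


def cost_by(conn: sqlite3.Connection, dimension: str, day: str) -> dict:
    """Total cost for one day (YYYY-MM-DD), grouped by a single dimension."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    query = (
        f"SELECT {dimension}, SUM(cost_usd) FROM cost_events "
        f"WHERE substr(ts, 1, 10) = ? GROUP BY {dimension}"
    )
    return dict(conn.execute(query, (day,)).fetchall())


conn = open_cost_db()
conn.executemany(
    "INSERT INTO cost_events VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("2026-01-05T09:00:00", "search", "portal", "qa", "paid", "example-large", "production", 2000, 500, 0.25),
        ("2026-01-05T10:00:00", "search", "portal", "qa", "free", "example-small", "production", 1000, 200, 0.5),
        ("2026-01-05T10:30:00", "support", "helpdesk", "triage", "paid", "example-small", "production", 800, 100, 0.125),
        ("2026-01-06T08:00:00", "support", "helpdesk", "triage", "paid", "example-small", "production", 800, 100, 1.0),
    ],
)
daily_by_team = cost_by(conn, "team", "2026-01-05")
```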

Budget Alert Engine

Budgets are defined at three levels:

  • Team budgets: each team has a monthly AI spend budget. Alerts fire at 80% consumption (warning) and at 100% (breach, which pages the engineering lead).
  • Product budgets: each product or major feature has a daily AI spend cap. If the daily cap is exceeded, an alert fires and optionally triggers automatic rate limiting on that feature.
  • Organisation budget: a weekly organisation-wide AI spend cap. If the weekly cap is reached, an escalation alert fires to CTO and FinOps.

Budget configurations are stored in a budget registry with an approval workflow; changes require FinOps sign-off.
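The three budget levels can be expressed as a small rule evaluator. The sketch below is illustrative only: the Budget class, the action names, and the auto_rate_limit flag are hypothetical, and the real alert engine would read budgets from the budget registry rather than construct them inline.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Budget:
    scope: str                     # "team" (monthly), "product" (daily) or "org" (weekly)
    name: str
    limit_usd: float
    auto_rate_limit: bool = False  # only meaningful for product budgets


def evaluate_budget(budget: Budget, spend_usd: float) -> List[str]:
    """Return the actions to take for the current period's spend."""
    if budget.limit_usd <= 0:
        raise ValueError("budget limit must be positive")
    ratio = spend_usd / budget.limit_usd

    if budget.scope == "team":
        if ratio >= 1.0:
            return ["page_engineering_lead"]   # breach
        if ratio >= 0.8:
            return ["warn_team"]               # 80% warning
        return []
    if budget.scope == "product":
        if ratio > 1.0:                        # daily cap exceeded
            actions = ["alert_product_owner"]
            if budget.auto_rate_limit:
                actions.append("rate_limit_feature")
            return actions
        return []
    if budget.scope == "org":
        return ["escalate_cto_finops"] if ratio >= 1.0 else []
    raise ValueError(f"unknown budget scope: {budget.scope}")


team = Budget("team", "search", 1000.0)
feature = Budget("product", "portal-qa", 50.0, auto_rate_limit=True)
team_actions = evaluate_budget(team, 850.0)
feature_actions = evaluate_budget(feature, 75.0)
```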

Cost Anomaly Detection

The cost anomaly engine monitors hourly spend rates and compares them to a rolling 7-day baseline (same hour of day across the past 7 days, to account for intra-day patterns). If the current hourly spend for any team or product exceeds 5x the baseline, a P2 anomaly alert fires. If it exceeds 10x, a P1 alert fires. Anomaly detection operates on the dimension of team × product × model to isolate which combination is driving the spike, enabling faster diagnosis.
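A minimal sketch of the spike classification follows. The pattern does not specify how the seven same-hour samples are combined into a baseline; this example assumes the arithmetic mean, and the function and variable names are hypothetical.

```python
from statistics import mean
from typing import Dict, Optional, Sequence, Tuple

P2_MULTIPLIER = 5.0   # spend above 5x baseline raises a P2 alert
P1_MULTIPLIER = 10.0  # spend above 10x baseline raises a P1 alert


def classify_spike(current_hourly_usd: float,
                   same_hour_history_usd: Sequence[float]) -> Optional[str]:
    """Compare this hour's spend with the same hour on the previous 7 days."""
    if not same_hour_history_usd:
        return None  # no history yet, so no baseline
    baseline = mean(same_hour_history_usd)  # assumption: mean of the samples
    if baseline <= 0:
        return None  # avoid dividing by an empty baseline
    ratio = current_hourly_usd / baseline
    if ratio > P1_MULTIPLIER:
        return "P1"
    if ratio > P2_MULTIPLIER:
        return "P2"
    return None


def detect(current: Dict[Tuple[str, str, str], float],
           history: Dict[Tuple[str, str, str], Sequence[float]]) -> Dict[Tuple[str, str, str], str]:
    """Evaluate every team x product x model combination independently."""
    alerts = {}
    for key, spend in current.items():
        severity = classify_spike(spend, history.get(key, ()))
        if severity is not None:
            alerts[key] = severity
    return alerts


key = ("search", "portal", "example-large")
alerts = detect({key: 60.0}, {key: [10.0] * 7})
```

Keying the evaluation on the full team × product × model tuple is what lets an alert name the combination driving the spike rather than only the organisation-wide total.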

Cost-Per-Outcome Metrics

Raw cost numbers are only part of the story. Cost-per-outcome metrics connect AI spend to business value. Three standard cost-per-outcome metrics are defined:

  • Cost per successful query: total AI cost for query sessions / number of sessions where the user completed their intended task (measured by a task completion signal).
  • Cost per resolved ticket: total AI cost for customer service AI / number of tickets closed without human escalation.
  • Cost per processed document: total AI cost for document processing workflows / number of documents successfully processed.

These metrics are configured per use case and tracked on the cost dashboard alongside raw spend.
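As an illustration of the cost-per-successful-query metric, the sketch below joins cost events to a task completion signal by session ID. The data shapes (tuples of session ID and cost, and a dictionary of completion flags) are assumptions; the same division applies to resolved tickets and processed documents.

```python
from typing import Dict, Iterable, Optional, Tuple


def cost_per_successful_session(cost_events: Iterable[Tuple[str, float]],
                                completed: Dict[str, bool]) -> Optional[float]:
    """Total AI cost across all sessions divided by the number of successful ones.

    Failed sessions still count towards the numerator: their cost was spent,
    so it must be carried by the outcomes that did succeed.
    """
    total_cost = 0.0
    sessions = set()
    for session_id, cost_usd in cost_events:
        total_cost += cost_usd
        sessions.add(session_id)
    successes = sum(1 for sid in sessions if completed.get(sid, False))
    if successes == 0:
        return None  # metric is undefined without any successful outcome
    return total_cost / successes


events = [("s1", 0.5), ("s1", 0.25), ("s2", 0.25), ("s3", 0.5)]
outcome_signal = {"s1": True, "s2": False, "s3": True}
metric = cost_per_successful_session(events, outcome_signal)
```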

Optimisation Recommendation Engine

The recommendation engine runs weekly and identifies the top three cost reduction opportunities across the organisation. It analyses four optimisation dimensions:

  • Model right-sizing: for each use case, identifies whether a cheaper model achieves equivalent quality (by comparing output quality metrics between current and cheaper models on the same query distribution).
  • Caching opportunities: identifies high-repetition query patterns where semantic caching could reduce LLM calls.
  • Prompt optimisation: identifies use cases where average prompt token count is significantly above the median (suggesting prompt inefficiency).
  • Batch processing: identifies use cases where requests could be batched and processed asynchronously with a batch-optimised model rather than in real time.

The top three recommendations are included in the weekly cost attribution report with estimated savings.
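The prompt optimisation check is the simplest of the four dimensions to illustrate. The sketch below flags use cases whose average prompt length is well above the median across use cases and returns the top three. The threshold factor of 2.0 is an assumed interpretation of "significantly above", and the use case names and token counts are invented.

```python
from statistics import median
from typing import Dict, List, Tuple


def prompt_outliers(avg_prompt_tokens: Dict[str, float],
                    factor: float = 2.0,
                    top_n: int = 3) -> List[Tuple[str, float]]:
    """Return (use_case, ratio_to_median) for the largest prompt outliers."""
    if not avg_prompt_tokens:
        return []
    med = median(avg_prompt_tokens.values())
    if med <= 0:
        return []
    flagged = [
        (use_case, tokens / med)
        for use_case, tokens in avg_prompt_tokens.items()
        if tokens > factor * med
    ]
    # Largest ratio first, so the worst offenders lead the weekly report.
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged[:top_n]


averages = {
    "faq": 400,
    "search": 500,
    "summarise": 600,
    "triage": 1500,
    "contract_review": 4000,
}
candidates = prompt_outliers(averages)
```

A ratio to the median only ranks candidates; turning it into an estimated saving would additionally need request volume and the input token price for the model in use.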


5. Architecture Diagram

flowchart TD
    subgraph Instrumentation["Instrumentation Layer"]
        A[LLM API Call]
        B[Cost Calculator]
        C[Cost Event with Tags]
    end
    subgraph Storage["Cost Storage"]
        D[(Cost Allocation DB)]
        E[Budget Registry]
    end
    subgraph Intelligence["Cost Intelligence"]
        F[Budget Alert Engine]
        G[Anomaly Detector]
        H[Optimisation Engine]
    end
    A --> B
    B -->|tokens x coefficient| C
    C --> D
    D --> E
    D --> F
    D --> G
    D --> H
    F --> I[Alerts + Rate Limits]
    G --> I
    H --> J[FinOps Dashboard]
    D --> J
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#fef9c3,stroke:#eab308
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#f0fdf4,stroke:#22c55e
    style I fill:#fee2e2,stroke:#ef4444
    style J fill:#d1fae5,stroke:#10b981

6. Components

Component | Type | Responsibility | Technology Options | Criticality
Cost Calculator | SDK Library | Compute costUsd from token counts × cost coefficient table | Built into AI Client Wrapper (EAAPL-OBS001); cost table in config or secrets | Critical
Cost Coefficient Table | Configuration | Maps modelId to USD cost per 1K input and output tokens; kept current | JSON config file; DynamoDB table; managed centrally by AI platform team | Critical
Cost Allocation Database | Storage | Time-series cost events with full dimension tagging; optimised for aggregation | ClickHouse, BigQuery, Redshift, TimescaleDB | Critical
Budget Registry | Storage | Budget definitions with team/product/org hierarchy; approval workflow | PostgreSQL; Airtable; custom service | High
Budget Alert Engine | Stream Processor | Monitor budget consumption; fire alerts at thresholds | Custom service querying cost DB hourly; Grafana alerting | Critical
Cost Anomaly Engine | Stream Processor | Detect anomalous spend rates vs baseline | Streaming job on cost events; hourly batch comparison | High
Cost Attribution Dashboard | UI | Self-serve cost visibility by every dimension | Grafana, Metabase, Looker, Tableau, custom React | High
Cost-Per-Outcome Calculator | Batch Job | Join cost events with outcome signals; compute cost efficiency metrics | Custom aggregation; Airflow/Prefect scheduled | Medium
Optimisation Recommendation Engine | Batch Job | Weekly analysis of top 3 cost reduction opportunities | Custom Python analysis; ML-based right-sizing evaluation | Medium
FinOps Integration | Integration | Unified view of cloud + AI costs | AWS Cost and Usage Report + custom join; Azure Cost Management; GCP Billing Export | High
Monthly Attribution Report | Reporting | Automated cost attribution report for business unit owners | Scheduled query export + email; Looker scheduled report | Medium

7. Data Flow

Primary Flow

Step | Actor | Action | Output
1 | Application Code | Sets context tags: team, product, use_case, user_tier, model, environment on AI client | Tagged AI client context
2 | AI Client Wrapper | Makes LLM API call; receives response with usage (inputTokens, outputTokens) | Token counts
3 | Cost Calculator | Looks up model cost coefficients; computes costUsd | costUsd field added to telemetry record
4 | OTel Collector | Receives cost event with full tags; batches; forwards to cost allocation DB | Cost event in cost DB
5 | Budget Consumption Monitor | Aggregates current spend by team/product/org; compares to budgets | Budget consumption percentages
6 | Alert Engine | Evaluates thresholds; fires alerts where breached | Alerts to PagerDuty/Slack/email
7 | Cost Anomaly Engine | Computes hourly rate vs baseline; evaluates spike thresholds | Anomaly alerts on spikes
8 | Attribution Dashboard | Aggregates cost by dimension for self-serve reporting | Real-time cost dashboards
9 | Optimisation Engine (weekly) | Analyses cost patterns; generates top 3 recommendations | Recommendation report
10 | Monthly Report | Generates formal attribution report; distributes to team leads and FinOps | PDF/CSV report with YoY and MoM trends

Error Flow

Error Scenario | Detection | Action | Recovery
Cost coefficient table missing entry for new model | costUsd = null in cost events | Alert AI platform team; estimate cost from similar model pending update | Add new model within 24 hours; backfill null cost estimates
Cost allocation DB ingestion lag | Ingestion lag metric > 30 minutes | Alert platform engineering; budget alerts may be delayed | Restore ingestion; accept delayed alerts during lag window
Anomaly detection false positive (expected traffic spike) | Alert volume spike; engineering investigation finds legitimate traffic increase | Suppress alert; adjust baseline with annotated known event | Annotate event in calendar; tuned baseline excludes known events
Budget overrun not caught by alert | Post-hoc FinOps review finds overspend | Root-cause the alert configuration failure; fix alert rules | Remediate; refund or recharge overspend to team budget
Cloud billing API unavailable | Unified view shows AI API cost only | Alert FinOps; display AI cost only with note | Restore when billing API recovers

8. Security Considerations

Authentication: Cost allocation database access requires service account authentication for ingestion. Dashboard access requires SSO with RBAC. API keys for cloud billing APIs stored in secrets manager.

Authorisation: Team-level cost data accessible to that team's leads and members. Cross-team cost visibility requires FinOps or VP-level role. Organisation-level cost data restricted to CTO, CFO, FinOps, and AI platform team.

Secrets Management: Model API keys are tagged with cost centres and are never stored in the cost DB. Cloud provider billing API credentials and cost DB connection strings are held in the secrets manager.

Data Classification: Cost data is classified as Internal. Aggregated cost-per-team data shared with finance is Confidential. Cost data combined with user behaviour data is Confidential.

Encryption: Cost allocation database encrypted at rest (AES-256) and in transit (TLS 1.3). Monthly attribution reports contain financial data and are encrypted at rest and transmitted over secure channels.

Auditability: All budget definition changes are audited with requester, approver, timestamp. Alert suppression events are logged. FinOps access to cost data is logged.

OWASP LLM Top 10 Coverage

OWASP LLM Risk | Cost Observability Control | Implementation
LLM01 Prompt Injection | Injection attacks may inflate token costs | Cost anomaly detection catches injection-driven token spikes
LLM02 Insecure Output Handling | Unusually long outputs from injection increase cost | Output token anomaly detection in cost spike analysis
LLM03 Training Data Poisoning | Out of scope for cost observability | Covered by drift detection pattern
LLM04 Model Denial of Service | Abusive usage drives cost spikes | Cost anomaly detection (10x spike = P1 alert) directly detects DoS cost impact
LLM05 Supply Chain Vulnerabilities | Unexpected model switch may change cost profile | Model dimension in cost tagging detects unexpected model changes via cost shift
LLM06 Sensitive Information Disclosure | Long prompts containing PII inflate cost | Cost per user anomaly may surface individual PII-heavy usage
LLM07 Insecure Plugin Design | Tool calls add cost beyond base LLM inference | Tool call cost tracking dimension
LLM08 Excessive Agency | Agentic loops can drive runaway costs | Cost per session cap; anomaly detection on per-session cost
LLM09 Overreliance | High cost per outcome signals low AI effectiveness | Cost-per-outcome metrics directly measure overreliance consequence
LLM10 Model Theft | Bulk extraction drives unusual cost patterns | Cost per API key anomaly detection

9. Governance Considerations

Responsible AI: Cost observability is a governance prerequisite. Organisations cannot govern AI deployment responsibly if they cannot account for its cost. Cost visibility enables ROI-based decisions about which AI use cases justify continued investment.

Model Risk Management: For regulated financial institutions, AI operational costs are part of the total cost of model ownership. Cost trends feed model lifecycle reviews.

Human Approval: Budget definitions require FinOps and team lead approval. Budget increases above 50% of current budget require VP-level approval. Emergency budget overrides require CFO notification.

Policy: AI cost management policy must define: budget allocation process, alert threshold configuration ownership, response obligations when budgets are breached, cost attribution methodology, and FinOps review cadence.

Traceability: Every cost event is traceable from the business unit cost line to the specific model calls that generated it. This traceability supports internal chargeback, external billing (for AI-as-a-service products), and audit.

Governance Artefacts

Artefact | Owner | Frequency | Format
AI Cost Attribution Report | FinOps + AI Platform | Monthly | Automated PDF/CSV to team leads and finance
Budget Registry with Approval Log | FinOps | Continuous | Version-controlled configuration + approval records
Cost Anomaly Incident Log | AI Platform | Per incident | Linked to incident management system
Optimisation Recommendation Tracker | AI Engineering | Weekly | Recommendation report + implementation tracking
AI ROI Report | AI Platform + Finance | Quarterly | Cost-per-outcome metrics vs business value
Annual FinOps Review | CFO + CTO + FinOps | Annual | Strategic AI investment review

10. Operational Considerations

Monitoring: The cost observability system is itself monitored. Cost allocation DB ingestion health, budget alert engine availability, and attribution dashboard freshness are tracked. Stale cost data (> 2 hours old) triggers an alert.

Logging: The cost allocation DB maintains a write-ahead log. All budget changes and alert actions are recorded in an immutable audit log.

Incident Response: A cost incident (P1 = budget exceeded or 10x spike) triggers the AI incident management process (EAAPL-OBS004) with cost incident category. FinOps is notified in addition to engineering on-call.

Disaster Recovery: Cost allocation DB is the highest-durability component. Financial records are replicated across availability zones. Cost events can be replayed from the OTel log store if the cost DB requires restoration.

Capacity Planning: Cost allocation DB grows with inference volume. At 1M requests/day with full dimension tagging, each record is approximately 200 bytes. Daily storage is approximately 200MB; annual is approximately 70GB. Columnar compression reduces this by 80%.
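The storage arithmetic can be reproduced directly. The sketch below takes the figures in the paragraph above as defaults (200 bytes per record, 80% compression) and uses decimal units (1 MB = 10^6 bytes); the function name is hypothetical. With 365 days the annual figure comes out at 73 GB, in line with the approximate 70 GB quoted.

```python
def storage_estimate(requests_per_day: int,
                     bytes_per_record: int = 200,
                     compression_saving: float = 0.8) -> dict:
    """Estimate cost DB storage from request volume, in decimal MB / GB."""
    daily_bytes = requests_per_day * bytes_per_record
    annual_bytes = daily_bytes * 365
    return {
        "daily_mb": daily_bytes / 1e6,
        "annual_gb": annual_bytes / 1e9,
        # Size remaining after columnar compression removes the stated share.
        "annual_gb_compressed": annual_bytes / 1e9 * (1 - compression_saving),
    }


estimate = storage_estimate(1_000_000)
```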

SLO Table

SLO | Target | Measurement | Alert Threshold
Cost data freshness | < 30 minutes lag from inference to cost DB | Ingestion lag metric | > 60 minutes
Budget alert delivery | < 10 minutes from threshold breach to alert | Alert delivery timestamp | > 20 minutes
Attribution dashboard data staleness | < 1 hour for dashboards | Dashboard data timestamp | > 2 hours
Monthly report delivery | By 3rd business day of following month | Report send timestamp | Missed delivery date

Disaster Recovery Table

Component | RTO | RPO | Recovery Approach
Cost Allocation DB | 30 minutes | 1 hour | Replicated DB; restore from OTel replay if needed
Budget Alert Engine | 15 minutes | N/A (stateless rules) | Auto-restart; budgets stored in registry
Attribution Dashboard | 60 minutes | N/A (read-only) | Redeploy from version control
Optimisation Engine | 24 hours | N/A (weekly batch) | Re-run weekly job

11. Cost Considerations

Cost Drivers

Driver | Description | Relative Cost
Cost allocation DB storage and compute | Time-series DB with aggregation; scales with inference volume | Medium
FinOps tooling licenses | Cloud cost management tools, BI tools for dashboards | Medium
Engineering time for cost management | Budget reviews, anomaly investigation, optimisation implementation | High (human time)
Optimisation recommendation engine | Weekly batch compute; relatively small | Low

Scaling Risks: Cost allocation DB query performance can degrade with very high-cardinality dimensions such as per-user cost tracking. Use the aggregated user_tier dimension for metrics and keep per-user cost tracking in logs only.

Optimisations:

  • Partition cost DB by date and team; most queries are recent + team-scoped
  • Materialise common aggregations (daily team cost, hourly anomaly baseline) as scheduled views
  • Use columnar storage (native in both BigQuery and ClickHouse; Parquet for files held in object storage) for roughly 80% storage reduction

Indicative Cost Range

Scale | AI Spend Being Tracked/Month | Estimated Cost Observability Infrastructure/Month
Small | $1,000–$10,000 | $200–$500
Medium | $10,000–$100,000 | $500–$2,000
Large | $100,000–$1,000,000 | $2,000–$8,000
Enterprise | $1,000,000+ | $5,000–$20,000 (< 2% of managed spend)

12. Trade-Off Analysis

Approach Comparison

Approach | Pros | Cons | Best For
Per-request cost tagging with full dimension taxonomy | Precise attribution; anomaly detection; optimisation recommendations; team accountability | Requires instrumentation discipline; cost coefficient table maintenance; additional storage | Organisations with multiple teams; enterprise AI spend; mature platform engineering
Cloud billing report only | Zero instrumentation effort; authoritative financial data | 24–48 hour lag; no per-request detail; no team/product attribution without cost allocation tags in billing | Single-team, single-product deployments; quick start
Vendor cost management (e.g., OpenAI usage dashboard, Anthropic console) | No infrastructure; per-request data available; some tagging | Single-vendor; no cross-model view; no business dimension attribution; no anomaly detection | Teams using a single AI provider; early-stage cost monitoring

Architectural Tensions

Tension | Description | Resolution
Granularity vs. Storage cost | Per-request records with 6 dimensions create large storage volumes | Use columnar compression; partition by date; aggregate old data
Accuracy vs. Timeliness | Per-request calculation is accurate but requires current coefficient table; billing API is authoritative but 24–48h delayed | Use per-request for operational monitoring; reconcile with billing monthly
Chargeback vs. Collaboration | Per-team cost tracking enables accountability but creates disincentive to share AI infrastructure | Use cost visibility for transparency, not punitive chargeback; shared savings from optimisations distributed back to teams
Alerting vs. Alert fatigue | Many budget thresholds create alert volume | Grade alerts: 80% = informational notification; 100% team = Slack; 100% org = page

13. Failure Modes

Failure | Likelihood | Impact | Detection | Recovery
Cost coefficient table out of date | Medium | Medium (underestimated costs) | Reconciliation variance between per-request estimate and billing | Update table; recompute estimates
Budget configured too high (no real control) | Medium | Medium (overspend not caught) | Annual budget review; anomaly detection still catches spikes | Tighten budgets based on actual spend history
Cost anomaly suppressed as false positive when real | Low | High (overspend undetected) | Anomaly suppression rate metric; FinOps monthly reconciliation | Never suppress without investigation; require written rationale
Per-request cost tags missing (untagged requests) | Medium | Medium (attribution gap) | Untagged request count metric; alert if > 5% untagged | Enforce tagging in AI client wrapper; block untagged requests in dev
Monthly report automation fails | Low | Low (manual workaround) | Report delivery failure alert | Manual report generation; fix automation
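The untagged-request check in the table above can be sketched as follows. The function names are hypothetical, and an event is treated as untagged if any of the six dimensions is missing or empty, which is one reasonable reading of "untagged".

```python
from typing import Iterable, List

REQUIRED_TAGS = ("team", "product", "use_case", "user_tier", "model", "environment")


def missing_tags(event: dict) -> List[str]:
    """Names of required dimensions that are absent or empty on this event."""
    return [tag for tag in REQUIRED_TAGS if not event.get(tag)]


def untagged_rate(events: Iterable[dict]) -> float:
    """Share of events with at least one missing dimension."""
    events = list(events)
    if not events:
        return 0.0
    untagged = sum(1 for event in events if missing_tags(event))
    return untagged / len(events)


def should_alert(events: Iterable[dict], threshold: float = 0.05) -> bool:
    """Alert when more than 5% of requests are untagged."""
    return untagged_rate(events) > threshold


complete = {"team": "search", "product": "portal", "use_case": "qa",
            "user_tier": "paid", "model": "example-large", "environment": "production"}
partial = {"team": "search", "model": "example-large"}
sample = [complete] * 18 + [partial] * 2
```

In a development environment the same missing_tags check could reject the request outright instead of only counting it.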

Cascading Scenarios

  • Scenario 1: New AI feature deployed without product budget configuration → budget alert never configured → feature goes viral → $50K AI spend in first weekend with no alert → monthly budget exhausted in 3 days. Mitigation: mandatory budget registration as deployment gate; no deployment without budget entry.
  • Scenario 2: Cost coefficient table not updated after model pricing change → actual costs underestimated by 40% → team thinks they're within budget → monthly bill reveals significant overrun. Mitigation: pricing change monitoring (subscribe to provider pricing changelog); automated coefficient table update.
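Scenario 2 is detectable through the monthly reconciliation between per-request estimates and the provider bill. The sketch below shows one way to compute that variance; the 5% tolerance and the function names are assumptions, not values defined by this pattern.

```python
def reconciliation_variance(estimated_usd: float, billed_usd: float) -> float:
    """Fraction of the billed amount not explained by per-request estimates.

    Positive values mean the estimates were too low (costs underestimated).
    """
    if billed_usd <= 0:
        raise ValueError("billed amount must be positive")
    return (billed_usd - estimated_usd) / billed_usd


def coefficient_table_suspect(estimated_usd: float,
                              billed_usd: float,
                              tolerance: float = 0.05) -> bool:
    """Flag the coefficient table for review when variance exceeds tolerance."""
    return abs(reconciliation_variance(estimated_usd, billed_usd)) > tolerance


# Estimates of $6,000 against a $10,000 bill: a 40% underestimate.
variance = reconciliation_variance(6000.0, 10000.0)
```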

14. Regulatory Considerations

Regulation | Clause | Requirement | AI Cost Observability Implementation
APRA CPS 230 | Para 53 (Operational Risk) | Technology cost is an operational risk; AI costs qualify | Cost observability enables cost risk management and budget control
APRA CPS 230 | Para 35 (Service Provider Management) | Third-party AI provider costs must be managed and monitored | Per-model cost tracking enables provider spend management
Privacy Act 1988 (AU) | APP 11 (Security) | Cost signals must not be derived in a way that leaks individual PII | Aggregated user_tier dimension; not per-user cost tracking
EU AI Act | Article 9 (Risk Management) | Cost management is part of AI operational risk management | Budget controls reduce financial risk of AI deployment
ISO/IEC 42001 | Clause 8.5 (AI System Lifecycle) | Resource management for AI systems must be documented | Cost observability provides the technical foundation for resource management
NIST AI RMF | GOVERN 1.1 | Organisational policies for AI management including resource allocation | Cost attribution and budget management implements resource allocation governance

15. Reference Implementations

AWS

  • Cost Tagging: AI Client Wrapper with context propagation; AWS tags on Bedrock requests
  • Cost DB: Amazon Redshift (columnar); or ClickHouse on EC2 for simpler deployments
  • Budget Alerts: Custom Lambda querying cost DB; CloudWatch Alarms on custom metrics
  • Anomaly Detection: Amazon Lookout for Metrics; custom CloudWatch anomaly detection
  • Attribution Dashboard: Amazon QuickSight with direct query to Redshift
  • FinOps Integration: AWS Cost and Usage Report + custom Athena query joining AI cost data
  • Monthly Report: Scheduled Lambda + SES for report delivery

Azure

  • Cost Tagging: AI Client Wrapper with Azure context propagation
  • Cost DB: Azure Synapse Analytics (Serverless SQL Pool); Azure Data Lake Storage
  • Budget Alerts: Azure Monitor custom metrics; Azure Cost Management Budgets for cloud costs
  • Anomaly Detection: Azure Anomaly Detector API on cost time series
  • Attribution Dashboard: Power BI Direct Query to Synapse
  • FinOps Integration: Azure Cost Management Export + custom join in Synapse
  • Monthly Report: Power BI Scheduled Report delivery

GCP

  • Cost Tagging: AI Client Wrapper; Vertex AI resource labels
  • Cost DB: BigQuery (native columnar, pay-per-query; ideal for cost data)
  • Budget Alerts: BigQuery scheduled queries + Cloud Monitoring custom metrics
  • Anomaly Detection: BigQuery ML anomaly detection on cost time series
  • Attribution Dashboard: Looker with BigQuery
  • FinOps Integration: GCP Billing Export to BigQuery + join with AI cost data
  • Monthly Report: Looker Scheduled Delivery

On-Premises

  • Cost Tagging: AI Client Wrapper
  • Cost DB: ClickHouse (excellent columnar performance; open source)
  • Budget Alerts: Custom Python service polling ClickHouse; Alertmanager for routing
  • Anomaly Detection: Custom Prophet or statsmodels implementation
  • Attribution Dashboard: Grafana with ClickHouse data source
  • FinOps Integration: Manual reconciliation with cloud billing data; custom ETL
  • Monthly Report: Scheduled SQL query + CSV export via email

16. Related Patterns

Pattern ID | Pattern Name | Relationship | Notes
EAAPL-OBS001 | AI Telemetry Architecture | Foundation | costUsd field and dimension tags defined in OBS001; this pattern builds the FinOps layer on top
EAAPL-OBS004 | AI Incident Management | Depends On | Cost incidents (budget breach, anomaly) trigger incident management process
EAAPL-OBS002 | Prompt Monitoring | Sibling | Prompt token length anomalies detected in OBS002 are a cost driver; shared alerting

17. Maturity Assessment

Overall Maturity: Proven

Dimension | Score (1–5) | Rationale
Adoption Breadth | 3 | Mature in FinOps-sophisticated organisations; still largely absent in mid-market
Tooling Ecosystem | 4 | ClickHouse, BigQuery, Grafana, Looker are mature; AI-specific FinOps tools emerging
Operational Runbook Coverage | 4 | Budget management and cost anomaly runbooks well-established in mature orgs
Regulatory Evidence | 3 | Not directly mandated; implicit in operational risk management frameworks
Cost Predictability | 5 | The pattern's primary value: makes AI costs predictable
Team Skill Availability | 4 | FinOps and data engineering skills broadly available

18. Revision History

Version | Date | Author | Changes
1.0.0 | 2026-06-12 | EAAPL Working Group | Initial publication