[EAAPL-AGT008] Event-Driven Agent
Category: Agentic AI
Sub-category: Event-Triggered Execution Architecture
Version: 1.2
Maturity: Proven
Tags: event-driven, kafka, eventbridge, webhooks, trigger-evaluation, back-pressure, context-assembly, reactive-agent
Regulatory Relevance: APRA CPS 234, EU AI Act (Art. 9, 14), ISO 42001 §8.4, NIST AI RMF (GOVERN 1.6)
1. Executive Summary
The Event-Driven Agent Pattern defines the architecture for AI agents that are triggered by external events — system alerts, customer actions, data changes, IoT signals, or business process milestones — rather than by direct user requests. Instead of waiting for a human to initiate a task, the event-driven agent subscribes to event streams, evaluates incoming events against trigger conditions, assembles context from the event payload and related data sources, and executes the appropriate agent task autonomously.
For CIO/CTO audiences: this pattern enables your AI workforce to be responsive to your business in real time, not just when humans remember to invoke it. When a customer's credit utilisation crosses a threshold, the agent proactively assembles their financial profile and prepares a targeted intervention. When a production deployment event fires, the agent immediately reviews the diff and post-deployment metrics. When a regulatory filing deadline is approaching, the agent automatically begins document collection and drafting. This is the architectural enabler for proactive AI — agents that respond to business conditions at machine speed, not human speed. The critical governance challenge this pattern solves is ensuring that reactive autonomy does not become uncontrolled autonomy: every trigger has explicit conditions, every event-initiated action is audited, and back-pressure controls prevent event floods from spawning runaway agent farms.
2. Problem Statement
Business Problem
Human operators cannot monitor every data stream, system event, and business condition that should trigger an AI-assisted action. The gap between "an event occurs that warrants action" and "a human notices and delegates the task" is measured in minutes to hours. In high-velocity domains (financial markets, customer churn, security incidents), this gap is where value is lost and risk is accumulated.
Technical Problem
Standard agent invocation patterns are pull-based: a human or calling system must initiate each agent task. There is no infrastructure for agents to subscribe to event streams, evaluate trigger conditions, and self-initiate. Without this, reactive automation requires humans to be in the loop for event detection, which eliminates the latency advantage.
Symptoms of Absence
- Time-sensitive business events (customer churn signals, security anomalies, SLA breaches) are actioned hours after detection
- Operations teams write brittle, bespoke scripts to detect events and invoke agent APIs
- Event floods (e.g., thousands of concurrent alerts) cause ad hoc agent invocation systems to fail without back-pressure controls
- Audit teams cannot reconstruct which events triggered which agent actions without bespoke logging
Cost of Inaction
- Revenue: Customer churn events actioned 2 hours late vs. 2 minutes produce measurably different outcomes
- Risk: Security alerts not actioned within SLA windows create undetected exposure periods
- Operational: Manual event-to-agent wiring creates maintenance overhead that grows with the number of event types
3. Context
When to Apply
- Agent actions should be triggered by data events, not human requests
- Response latency to events matters (seconds to minutes, not hours)
- The organisation has existing event infrastructure (Kafka, EventBridge, webhooks, message queues)
- Event volume can vary significantly; back-pressure handling is required
- Audit trail of which events triggered which agent actions is a compliance requirement
When NOT to Apply
- All agent tasks are human-initiated and interactive
- Event volume is very low (< 10 events/day) — a cron-based or webhook-triggered approach is simpler
- Events are highly structured and actions are deterministic — a rule engine is more appropriate than an AI agent
Prerequisites
- Event infrastructure: Kafka, EventBridge, Azure Event Grid, Google Pub/Sub, or webhook receiver
- Event schema registry for source events
- EAAPL-AGT001 (Single Agent Pattern) for agent execution
- EAAPL-AGT009 (Agent Identity) for event-triggered agent workload identity
- Dead-letter queue infrastructure for failed event processing
Industry Applicability
| Industry | Event Source | Trigger Event | Agent Action |
|---|---|---|---|
| Financial Services | Transaction system | Unusual transaction pattern | Fraud triage, customer notification draft |
| Retail | CRM / Analytics | Churn propensity score > threshold | Retention offer preparation |
| Healthcare | Monitoring system | Patient vital sign alert | Clinical summary assembly for on-call nurse |
| Technology / SaaS | CI/CD pipeline | Deployment event | Post-deploy quality check, release notes |
| Security Operations | SIEM | Security alert (high severity) | Incident context assembly, playbook execution |
4. Architecture Overview
The Event-Driven Agent Pattern wraps the standard agent loop (EAAPL-AGT001) with an event-subscription and trigger-evaluation layer. The key architectural insight is that not every event warrants agent invocation: events must be evaluated against trigger conditions before agent resources are committed.
Event Ingestion and Normalisation
The Event Ingestion Layer subscribes to one or more event sources (Kafka topics, EventBridge rules, webhook endpoints). Incoming events are normalised into a standard internal event format: event_id, event_type, source_system, timestamp, payload (structured), and entity_references (IDs of related entities whose data may be needed for context assembly). Normalisation decouples the agent infrastructure from the idiosyncratic schemas of source systems.
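The internal event format described above can be sketched as a small data class plus one mapper per source schema. This is a minimal illustration, not a prescribed implementation: the class name `NormalisedEvent`, the function `normalise_credit_event`, the source system label `core-banking`, and the shape of the raw event are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class NormalisedEvent:
    """Internal standard event format (field names taken from the text above)."""
    event_id: str
    event_type: str
    source_system: str
    timestamp: datetime
    payload: dict[str, Any]
    entity_references: dict[str, str] = field(default_factory=dict)


def normalise_credit_event(raw: dict[str, Any], source_system: str) -> NormalisedEvent:
    """Map one hypothetical source schema onto the internal format."""
    payload = dict(raw["payload"])  # copy so the source object is not shared
    return NormalisedEvent(
        event_id=raw["event_id"],
        event_type=raw["type"],
        source_system=source_system,
        timestamp=datetime.fromisoformat(raw["timestamp"]),
        payload=payload,
        # IDs the Context Assembler will later use to look up related data
        entity_references={"customer_id": payload["customer_id"]},
    )


raw_event = {
    "event_id": "evt-123",
    "type": "credit.utilisation.high",
    "payload": {"customer_id": "cust-456", "utilisation": 0.92},
    "timestamp": "2025-01-01T00:00:00+00:00",
}
event = normalise_credit_event(raw_event, source_system="core-banking")
```

Each additional source system would get its own mapper, so downstream components only ever see `NormalisedEvent`.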
Trigger Evaluation
The Trigger Evaluator is a lightweight, low-latency component that evaluates each normalised event against a library of trigger rules. A trigger rule defines: the event type(s) that activate it, the conditions on the event payload (e.g., amount > 50000, severity == "HIGH"), any deduplication logic (suppress duplicate triggers for the same entity within a time window), and the target agent task type and priority. Trigger evaluation must be fast (target: < 5ms) because it runs on every event; it must not invoke the LLM. If no trigger rule matches, the event is dropped (or archived for audit). If a trigger matches, a task record is created and queued.
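A minimal sketch of trigger evaluation with per-entity deduplication follows. It assumes rule conditions are plain Python predicates and that deduplication state lives in an in-process dictionary; a production deployment would more likely keep that state in a TTL-keyed cache as listed in the Components section. The names `TriggerRule`, `TriggerEvaluator`, and `CREDIT_RULE` are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class TriggerRule:
    rule_id: str
    event_types: frozenset
    condition: Callable[[dict], bool]  # pure predicate on the payload; never calls the LLM
    entity_key: str                    # payload field that identifies the entity
    dedup_window_s: float
    task_type: str
    priority: str


class TriggerEvaluator:
    def __init__(self, rules, clock=time.monotonic):
        self._rules = list(rules)
        self._clock = clock
        self._last_fired: dict[tuple, float] = {}  # (rule_id, entity) -> last fire time

    def evaluate(self, event_type: str, payload: dict[str, Any]) -> list[dict]:
        """Return one task record per matching rule; an empty list means no trigger."""
        now = self._clock()
        tasks = []
        for rule in self._rules:
            if event_type not in rule.event_types or not rule.condition(payload):
                continue
            key = (rule.rule_id, str(payload.get(rule.entity_key)))
            last = self._last_fired.get(key)
            if last is not None and now - last < rule.dedup_window_s:
                continue  # duplicate for this entity inside the window: suppress
            self._last_fired[key] = now
            tasks.append({"trigger_rule_id": rule.rule_id, "task_type": rule.task_type,
                          "priority": rule.priority, "entity": key[1]})
        return tasks


CREDIT_RULE = TriggerRule(
    rule_id="CREDIT_UTILISATION_INTERVENTION",
    event_types=frozenset({"credit.utilisation.high"}),
    condition=lambda p: p.get("utilisation", 0) > 0.90,
    entity_key="customer_id",
    dedup_window_s=24 * 3600,
    task_type="credit_intervention",
    priority="HIGH",
)
evaluator = TriggerEvaluator([CREDIT_RULE])
tasks = evaluator.evaluate("credit.utilisation.high",
                           {"customer_id": "cust-456", "utilisation": 0.92})
```

Keeping conditions as pure in-memory predicates is what makes a millisecond-level latency target plausible; anything that needs I/O belongs in context assembly instead.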
Context Assembly
When a task is queued from an event trigger, the Context Assembler enriches the triggering event with data from related systems. This enrichment is the most important step: the LLM cannot act on a bare event payload alone. For a credit utilisation event, the context might include: the customer's full credit profile, recent transaction history (last 30 days), prior intervention history, and the organisation's intervention policy guidelines. The Context Assembler retrieves this data using pre-defined data access patterns specified in the trigger rule configuration. Context assembly is an async, potentially multi-source operation that completes before the agent loop begins.
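The multi-source fetch can be sketched with `asyncio`. This is an illustrative outline under stated assumptions: the fetcher functions `fetch_profile` and `fetch_transactions` are stand-ins for real data source clients, and the 10-second default timeout is chosen only to mirror the context assembly SLO later in this document.

```python
import asyncio
from typing import Any, Awaitable, Callable

Fetcher = Callable[[str], Awaitable[Any]]


async def assemble_context(entity_id: str, fetchers: dict[str, Fetcher],
                           timeout_s: float = 10.0) -> dict[str, Any]:
    """Fetch every configured source concurrently; return only when all have completed.

    A timeout or any fetcher exception propagates to the caller, which can then
    retry or dead-letter the task instead of starting the agent loop with partial context.
    """
    names = list(fetchers)
    results = await asyncio.wait_for(
        asyncio.gather(*(fetchers[name](entity_id) for name in names)),
        timeout=timeout_s,
    )
    return dict(zip(names, results))


# Hypothetical fetchers standing in for CRM and transaction-store clients.
async def fetch_profile(customer_id: str) -> dict:
    await asyncio.sleep(0)
    return {"customer_id": customer_id, "segment": "retail"}


async def fetch_transactions(customer_id: str) -> list:
    await asyncio.sleep(0)
    return [{"amount": 120.0}]


context = asyncio.run(assemble_context(
    "cust-456",
    {"crm_profile": fetch_profile, "transactions_30d": fetch_transactions},
))
```

The `fetchers` mapping plays the role of the data access patterns declared in the trigger rule, so each rule fetches only what it names.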
Back-Pressure and Rate Limiting
Event floods are a real failure mode in event-driven systems. If a system generates 10,000 alert events per minute (e.g., a monitoring system during a major incident), naively spawning 10,000 agent tasks would exhaust LLM quotas, compute, and downstream API capacity simultaneously. The Back-Pressure Controller enforces: maximum concurrent agent tasks per trigger type, per entity (deduplicate: one active task per customer/entity per trigger type), and globally. Excess events are held in a priority queue (high-priority events pre-empt lower-priority ones) and processed as capacity becomes available. A circuit breaker detects event flood conditions and can switch to a degraded mode (e.g., trigger evaluation only; no agent execution; alert the operations team).
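The admission logic can be sketched as a small in-memory controller. This is a simplified, single-process illustration: the class name `BackPressureController`, the priority labels, and the return values `admitted`, `queued`, and `duplicate` are assumptions, and a real deployment would hold this state in shared infrastructure rather than process memory.

```python
import heapq
import itertools
from collections import Counter

PRIORITY_RANK = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}  # lower rank is served first


class BackPressureController:
    def __init__(self, global_limit: int, per_trigger_limit: int):
        self.global_limit = global_limit
        self.per_trigger_limit = per_trigger_limit
        self._running = Counter()    # trigger type -> running task count
        self._known = set()          # (trigger, entity) pairs running or queued
        self._queue = []             # heap of (rank, seq, trigger, entity)
        self._seq = itertools.count()  # tie-breaker keeps FIFO order within a priority

    @property
    def queue_depth(self) -> int:
        return len(self._queue)

    def _has_capacity(self, trigger: str) -> bool:
        return (sum(self._running.values()) < self.global_limit
                and self._running[trigger] < self.per_trigger_limit)

    def submit(self, trigger: str, entity: str, priority: str) -> str:
        key = (trigger, entity)
        if key in self._known:
            return "duplicate"  # one active task per entity per trigger type
        self._known.add(key)
        if self._has_capacity(trigger):
            self._running[trigger] += 1
            return "admitted"
        heapq.heappush(self._queue, (PRIORITY_RANK[priority], next(self._seq), trigger, entity))
        return "queued"

    def complete(self, trigger: str, entity: str):
        """Release a slot, then start the highest-priority queued task that fits."""
        self._running[trigger] -= 1
        self._known.discard((trigger, entity))
        deferred, started = [], None
        while self._queue:
            item = heapq.heappop(self._queue)
            if self._has_capacity(item[2]):
                self._running[item[2]] += 1
                started = (item[2], item[3])
                break
            deferred.append(item)  # its trigger type is still at its limit
        for item in deferred:
            heapq.heappush(self._queue, item)
        return started


controller = BackPressureController(global_limit=10, per_trigger_limit=5)
decision = controller.submit("credit_intervention", "cust-456", "HIGH")
```

The `queue_depth` value is the natural signal to feed into the circuit breaker described above.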
Event-to-Action Audit
Every event-triggered agent invocation creates an immutable audit record linking: the source event (event_id, source_system, timestamp), the trigger rule that matched, the task created, and the final agent action. This bidirectional linkage enables compliance teams to answer "what event caused this agent action?" and "what did the agent do in response to this event?" — both questions that arise regularly in financial services and security operations regulatory reviews.
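The bidirectional linkage can be illustrated with an append-only log indexed in both directions. This sketch is in-memory only and therefore not actually immutable; it shows the record shape and the two lookups, on the assumption that durable storage would be a WORM store as listed in the Components section. The names `AuditRecord` and `EventActionAuditLog` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    source_event_id: str
    source_system: str
    event_timestamp: str
    trigger_rule_id: str
    task_id: str
    agent_action: str
    recorded_at: str


class EventActionAuditLog:
    """Append-only: records are frozen and there is no update or delete method."""

    def __init__(self):
        self._records = []
        self._by_event = {}  # event_id -> records (one event may match several rules)
        self._by_task = {}   # task_id -> record

    def append(self, **fields) -> AuditRecord:
        record = AuditRecord(recorded_at=datetime.now(timezone.utc).isoformat(), **fields)
        self._records.append(record)
        self._by_event.setdefault(record.source_event_id, []).append(record)
        self._by_task[record.task_id] = record
        return record

    def actions_for_event(self, event_id: str) -> list:
        """Answers: what did the agent do in response to this event?"""
        return list(self._by_event.get(event_id, []))

    def event_for_task(self, task_id: str):
        """Answers: what event caused this agent action?"""
        return self._by_task.get(task_id)


audit_log = EventActionAuditLog()
audit_log.append(
    source_event_id="evt-123",
    source_system="core-banking",
    event_timestamp="2025-01-01T00:00:00+00:00",
    trigger_rule_id="CREDIT_UTILISATION_INTERVENTION",
    task_id="task-789",
    agent_action="draft_prepared",
)
```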
5. Architecture Diagram
flowchart TD
subgraph Input["Event Sources"]
A[Kafka / EventBridge]
B[Webhooks / CDC]
end
subgraph Core["Trigger and Context Core"]
C[Event Normaliser]
D{Trigger Evaluator}
E[Context Assembler]
F[Back-Pressure Controller]
end
subgraph Execution["Agent Execution"]
G[Agent Loop]
H[(Audit Log)]
end
A --> C
B --> C
C --> D
D -->|no match| H
D -->|match| F
F -->|within capacity| E
F -->|flood| H
E --> G
G --> H
style A fill:#dbeafe,stroke:#3b82f6
style B fill:#dbeafe,stroke:#3b82f6
style C fill:#f0fdf4,stroke:#22c55e
style D fill:#f3e8ff,stroke:#a855f7
style E fill:#f0fdf4,stroke:#22c55e
style F fill:#f0fdf4,stroke:#22c55e
style G fill:#d1fae5,stroke:#10b981
style H fill:#fef9c3,stroke:#eab308
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| Event Ingestion Layer | Integration | Subscribes to event sources; receives webhooks; provides unified event stream | Kafka consumer, EventBridge rule, AWS Lambda function URL, Azure Event Grid webhook | Critical |
| Event Normaliser | Data Transform | Maps source-specific schemas to internal standard event format | Custom transform functions; Apache Camel; Amazon EventBridge input transformer; Schema Registry | High |
| Event Schema Registry | Data Governance | Stores schemas for all event types; validates incoming events | Confluent Schema Registry, AWS Glue Schema Registry, Azure Schema Registry | High |
| Trigger Evaluator | Logic Engine | Evaluates normalised events against trigger rule library; < 5ms target | Custom rule engine; OPA; Spring Event system; AWS EventBridge rules | Critical |
| Trigger Rule Library | Configuration Store | Stores trigger rule definitions: event_type, conditions, target_task, priority | PostgreSQL, DynamoDB, Redis (hot rules) | Critical |
| Deduplication Cache | State | Prevents duplicate trigger fires for same entity within time window | Redis (TTL-keyed); DynamoDB with TTL | High |
| Back-Pressure Controller | Rate Limiting | Enforces concurrency limits; manages priority queue; detects floods | Custom + semaphore; Kafka consumer group lag monitoring | Critical |
| Priority Queue | Buffering | Holds excess triggered tasks ordered by priority | SQS FIFO, Azure Service Bus (sessions), Kafka topic with priority consumer groups | High |
| Circuit Breaker | Resilience | Detects event flood; switches to degraded mode; alerts | Resilience4j, Polly, custom; integrates with alerting | High |
| Context Assembler | Data Enrichment | Retrieves context data from multiple sources per trigger rule's data access patterns | Custom orchestration; Apache Camel; custom async multi-source fetch | High |
| Agent Execution | AI Worker | Executes enriched task via standard agent loop | EAAPL-AGT001 implementation | Critical |
| Event-to-Action Audit Log | Compliance | Immutable bidirectional link between source event and agent action | WORM store; S3 Object Lock; Azure immutable Blob Storage | Critical |
| Notification / Downstream Events | Integration | Publishes agent action results to downstream systems and stakeholders | Kafka, EventBridge, SNS, Teams/Slack webhooks | Medium |
7. Data Flow
Standard Event-to-Action Flow
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Source System | Emits event: {event_id: "evt-123", type: "credit.utilisation.high", payload: {customer_id: "cust-456", utilisation: 0.92}, timestamp} | Raw event on Kafka/EventBridge |
| 2 | Event Ingestion | Receives raw event; applies schema validation | Validated raw event |
| 3 | Event Normaliser | Maps to internal standard format; extracts entity_references: [customer_id: "cust-456"] | Normalised event |
| 4 | Trigger Evaluator | Evaluates against trigger rules; matches rule "CREDIT_UTILISATION_INTERVENTION": utilisation > 0.90; dedup check: no active task for cust-456 in last 24h | Trigger match: task_type=credit_intervention, priority=HIGH |
| 5 | Back-Pressure Controller | Checks concurrent task count: 3 running, limit 10; permits | Task admitted |
| 6 | Context Assembler | Fetches: CRM profile for cust-456, last 30 days transactions, prior intervention history, intervention policy guidelines | Enriched context package |
| 7 | Agent Execution | Executes credit intervention agent: analyses profile, drafts personalised offer or recommendation, determines best channel | Agent action result |
| 8 | Result Handler | Routes action result: queues the agent's draft for human review or takes direct action, per policy | Result routed |
| 9 | Audit Log | Writes: {source_event_id: "evt-123", trigger_rule_id: "CREDIT_UTILISATION_INTERVENTION", task_id: "task-789", agent_action: "draft_prepared", timestamp} | Audit record |
Event Flood Scenario
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Monitoring System | Emits 5,000 HIGH alerts in 60 seconds (major incident) | Alert flood |
| 2 | Back-Pressure Controller | Running tasks = 10/10 capacity; sends events to priority queue | Priority queue backlog |
| 3 | Circuit Breaker | Queue depth > circuit breaker threshold (e.g., 500); activates degraded mode | Alert: "Event flood detected; agent execution paused" |
| 4 | Operations Team | Receives alert; reviews; decides to increase capacity or triage manually | Manual decision |
| 5 | Recovery | Queue drains as incident resolves; circuit breaker resets; normal processing resumes | Normal operation restored |
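The flood scenario above can be sketched as a queue-depth circuit breaker. The trip threshold of 500 comes from the example in step 3; the lower reset threshold (hysteresis, so the breaker does not flap while the queue hovers near the limit) and the names `FloodCircuitBreaker` and `observe` are assumptions made for this illustration.

```python
from typing import Callable


class FloodCircuitBreaker:
    def __init__(self, trip_depth: int = 500, reset_depth: int = 50,
                 alert: Callable[[str], None] = print):
        self.trip_depth = trip_depth
        self.reset_depth = reset_depth  # assumed: resume only once the backlog has drained
        self._alert = alert
        self.degraded = False

    def observe(self, queue_depth: int) -> bool:
        """Feed the current queue depth; returns True while agent execution is paused."""
        if not self.degraded and queue_depth > self.trip_depth:
            self.degraded = True
            self._alert("Event flood detected: agent execution paused")
        elif self.degraded and queue_depth <= self.reset_depth:
            self.degraded = False
        return self.degraded


alerts = []
breaker = FloodCircuitBreaker(alert=alerts.append)
states = [breaker.observe(depth) for depth in (100, 501, 300, 50)]
```

In degraded mode the trigger evaluator keeps running and events keep queueing; only the hand-off to agent execution is held back.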
Error Flow
| Error | Detection | Recovery |
|---|---|---|
| Context assembly fails (data source unavailable) | Timeout / exception in assembler | Retry with backoff; if assembly fails after N retries, send to DLQ with error context; alert |
| Agent execution fails | Task failure | Retry from checkpoint; DLQ after max retries |
| Trigger rule misconfigured (floods agent) | Rapid task creation for single trigger_rule_id | Alert; disable rule; manual review |
| Duplicate event processed (idempotency) | Deduplication cache miss (cache expired) | Idempotency key on task creation; ON CONFLICT DO NOTHING on task record |
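The idempotent task creation in the last row can be sketched with SQLite, which supports the same `ON CONFLICT ... DO NOTHING` clause as PostgreSQL. The table layout, the function name `create_task_once`, and the choice of `source_event_id` plus `trigger_rule_id` as the idempotency key are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tasks (
        idempotency_key TEXT PRIMARY KEY,
        task_type       TEXT NOT NULL,
        status          TEXT NOT NULL DEFAULT 'queued'
    )
    """
)


def create_task_once(conn: sqlite3.Connection, source_event_id: str,
                     trigger_rule_id: str, task_type: str) -> bool:
    """Insert a task record; return False if the same event/rule pair already has one."""
    key = f"{source_event_id}:{trigger_rule_id}"
    cursor = conn.execute(
        "INSERT INTO tasks (idempotency_key, task_type) VALUES (?, ?) "
        "ON CONFLICT(idempotency_key) DO NOTHING",
        (key, task_type),
    )
    conn.commit()
    return cursor.rowcount == 1  # 0 rows changed means the insert was skipped


created_first = create_task_once(conn, "evt-123", "CREDIT_UTILISATION_INTERVENTION",
                                 "credit_intervention")
created_again = create_task_once(conn, "evt-123", "CREDIT_UTILISATION_INTERVENTION",
                                 "credit_intervention")
```

Because the constraint lives in the database, the check still holds after the deduplication cache entry has expired.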
8. Security Considerations
Event Source Trust
- Not all event sources are equally trusted; events from external webhooks are lowest trust and must be authenticated (HMAC signature verification, API key, OAuth)
- Internal events from trusted systems (Kafka with ACLs, EventBridge with resource policies) receive higher trust but are still validated against the schema registry
- Trigger conditions must not be derivable from public information that an attacker could control to trigger privileged agent actions
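As one way to authenticate external webhooks, HMAC signature verification can be sketched with the Python standard library. The assumptions here are that the sender signs the raw request body with HMAC-SHA256 and transmits the hex digest; real providers differ in header name, encoding, and whether a timestamp is included in the signed material, so the provider's own documentation takes precedence.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True only if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of a forged signature were correct.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


shared_secret = b"example-shared-secret"  # hypothetical; load from a secret store in practice
request_body = b'{"event_id": "evt-123", "type": "credit.utilisation.high"}'
good_signature = hmac.new(shared_secret, request_body, hashlib.sha256).hexdigest()

accepted = verify_webhook_signature(shared_secret, request_body, good_signature)
rejected = verify_webhook_signature(shared_secret, request_body + b" ", good_signature)
```

Verification should run on the raw bytes as received, before any JSON parsing or normalisation, since re-serialising the body can change it.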
Audit Linkage
- The bidirectional event-to-action audit log is the forensic foundation for investigating "why did the agent do that?" — it must be tamper-proof and queryable
OWASP LLM Top 10
| OWASP LLM Risk | Event-Driven Applicability | Mitigation |
|---|---|---|
| LLM01 Prompt Injection | Event payload content could contain injected instructions | Payload sanitisation before context assembly; event content is treated as data and never placed in the system prompt as instructions |
| LLM08 Excessive Agency | Autonomous event-triggered execution means no human initiates the action; the agent acts entirely on its own | Trigger rules are the human-approved scope boundary; PRIVILEGED actions require human check-in gate before execution; back-pressure limits concurrent autonomous actions; all actions audited in event-to-action log |
| LLM04 Model Denial of Service | Event floods spawn unbounded agent tasks | Back-pressure controller; circuit breaker; concurrency limits |
| LLM06 Sensitive Information Disclosure | Context assembly retrieves sensitive data from multiple sources | Context classification; data minimisation in assembly (only fetch what trigger rule specifies); encryption in transit and at rest |
9. Governance Considerations
Trigger Rule Governance
- Every trigger rule must be approved by the business owner of the source event and the owner of the target agent action before activation
- Trigger rules are versioned; changes require change management review with impact assessment
- Quarterly review of all active trigger rules: is each rule still producing value? Are any rules causing unintended agent actions?
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
|---|---|---|---|
| Trigger Rule Register | Platform Governance | Continuous (auto-generated) | Complete inventory of active trigger rules, their owners, and approval status |
| Event-to-Action Audit Report | Compliance | Monthly | Summary of all event-triggered actions; anomaly flags; high-priority event handling |
| Back-Pressure Incident Log | Operations | Per incident | Record of capacity limit events, circuit breaker activations, manual interventions |
| Trigger Rule Change Log | Platform Engineering | Per change | Versioned record of all trigger rule additions, modifications, and deactivations |
10. Operational Considerations
SLOs
| SLO | Target | Window | Alert |
|---|---|---|---|
| Event-to-task-creation latency (HIGH priority) | ≤ 5 seconds | 1-hour rolling | > 30 seconds triggers P1 |
| Context assembly latency | ≤ 10 seconds p95 | 1-hour rolling | > 30 seconds triggers P2 |
| Event-to-action audit completeness | 100% | Daily reconciliation | Any gap triggers P0 |
| Dead-letter queue depth | ≤ 100 events | Continuous | > 500 triggers P2 |
| Circuit breaker activation frequency | < 1 per week | Weekly | Any activation triggers operational review |
Monitoring
- Event ingestion rate per topic/source: detect floods, silences (source failure), and schema drift
- Trigger match rate per rule: sustained 0% match rate may indicate rule misconfiguration or source schema change
- Context assembly success rate per data source: low rate indicates data source degradation
- Agent task queue depth: early warning of processing backlogs
11. Cost Considerations
Cost Drivers
| Cost Driver | Scaling Behaviour | Control |
|---|---|---|
| LLM inference | Linear with triggered task count | Trigger condition tuning to reduce false positives; smaller model for triage |
| Context assembly API calls | Linear with triggered tasks × data sources per rule | Data source caching; context TTL for recently assembled contexts |
| Event infrastructure | Linear with event volume | EventBridge is billed per event, while Kafka cost tracks broker capacity and throughput; filter at source where possible |
| Agent compute | Linear with concurrent tasks | Back-pressure limits; auto-scaling with ceiling |
Indicative Cost Range (per 10,000 ingested events/day)
| Trigger Match Rate | Agent Tasks/Day | LLM Cost/Day | Notes |
|---|---|---|---|
| 5% (well-tuned rules) | 500 agent tasks | $5–50 | Low false positive rate |
| 20% (moderate precision) | 2,000 agent tasks | $20–200 | Review trigger conditions |
| 60% (over-broad rules) | 6,000 agent tasks | $60–600 | Urgent trigger rule review required |
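The arithmetic behind the table can be made explicit. The per-task cost band of $0.01 to $0.10 is not stated in the source; it is inferred here by dividing each cost range by its task count, and actual per-task cost will depend on model choice and context size. The function name `daily_llm_cost` is hypothetical.

```python
def daily_llm_cost(events_per_day: int, match_rate: float,
                   cost_per_task_low: float = 0.01,   # inferred from the table, not a quoted price
                   cost_per_task_high: float = 0.10):
    """Return (agent tasks per day, low cost estimate, high cost estimate) in dollars."""
    tasks = round(events_per_day * match_rate)
    return tasks, round(tasks * cost_per_task_low, 2), round(tasks * cost_per_task_high, 2)


well_tuned = daily_llm_cost(10_000, 0.05)
moderate = daily_llm_cost(10_000, 0.20)
over_broad = daily_llm_cost(10_000, 0.60)
```

Because cost is linear in the match rate, tightening an over-broad rule from 60% to 5% cuts the LLM bill by the same factor of twelve.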
12. Trade-Off Analysis
Event Trigger Architecture Options
| Option | Description | Pros | Cons | Best For |
|---|---|---|---|---|
| A: Kafka + Custom Trigger Evaluator (Recommended) | Agent subscribes to Kafka; custom rule engine evaluates triggers | Maximum flexibility; highest throughput; full control | Engineering complexity; requires Kafka expertise | High-volume, high-precision enterprise deployments |
| B: EventBridge / Event Grid Rules | Cloud-native event routing with built-in filter rules | Low ops overhead; managed service; native cloud integration | Limited condition expressiveness; vendor lock-in | Cloud-native organisations; moderate event volumes |
| C: Webhook + Synchronous Trigger | Source systems call agent webhook directly | Simple; low latency; easy to reason about | Source system coupling; no back-pressure; no deduplication | Simple use cases; trusted internal sources only |
| D: Cron-Polling (Pseudo Event-Driven) | Agent polls data sources on schedule; detects state changes | No event infrastructure required | Higher latency; polling cost; doesn't scale to high event volumes | Legacy systems without event emission capability |
Architectural Tensions
| Tension | Left Pole | Right Pole | Balance |
|---|---|---|---|
| Trigger sensitivity vs. Agent cost | Broad triggers: catch everything; high recall | Narrow triggers: only high-confidence events; low cost | Start narrow; broaden based on measured false negative rate |
| Context richness vs. Assembly latency | Fetch all potentially relevant data | Fetch minimal context for speed | Pre-specify context packages per trigger rule type; cache frequently accessed data |
| Real-time response vs. Human oversight | Fully autonomous event response; maximum speed | Human approves every event-triggered action; maximum oversight | Risk-tiered: LOW impact actions → autonomous; HIGH/IRREVERSIBLE → human approval gate |
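The risk-tiered balance in the last row can be sketched as a small routing function. The source defines only the LOW and HIGH/IRREVERSIBLE tiers; routing everything that is not explicitly LOW to the approval gate is a fail-safe assumption added for this example, as are the names `Impact` and `route_action`.

```python
from enum import Enum


class Impact(Enum):
    LOW = "low"
    HIGH = "high"
    IRREVERSIBLE = "irreversible"


def route_action(impact: Impact) -> str:
    """Only LOW-impact actions run autonomously; every other tier waits for a human."""
    if impact is Impact.LOW:
        return "execute_autonomously"
    return "await_human_approval"  # assumed default for any tier that is not LOW


routes = {impact.name: route_action(impact) for impact in Impact}
```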
13. Failure Modes
| Failure Mode | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Trigger rule fires on malicious/injected event payload | Low | Critical: agent takes unintended action | Event source authentication; payload sanitisation; anomaly detection on trigger rate | Suspend trigger rule; investigate; audit all actions triggered by the rule |
| Context assembly returns stale data (cache hit on outdated profile) | Medium | Medium: agent acts on old information | Cache TTL tuning; freshness check | Short TTL on sensitive context; force-refresh option in trigger rule |
| Event flood crashes context assembler (data source overload) | Low | High: agent tasks fail | Context assembly error rate; data source health check | Circuit breaker on context assembly; queuing; data source auto-scaling |
| Back-pressure queue grows unboundedly | Low | High: memory pressure; delayed actions | Queue depth monitoring | Expiry policy on queued tasks; drop oldest if queue exceeds ceiling |
| Trigger rule produces duplicate actions for same entity | Medium | Medium: spam, duplicate data | Deduplication window cache | Tune dedup window TTL per trigger type |
14. Regulatory Considerations
APRA CPS 234
- Event-driven agents that access or modify customer data are subject to CPS 234 information security requirements
- The event-to-action audit log is a CPS 234-mandated information security record for automated system actions
EU AI Act
- Art. 14 (Human Oversight): for high-risk event-triggered actions, a human approval gate between trigger and execution is required; the priority queue provides a hold point for this gate
- For GPAI models used in event-driven agents: the trigger rule library's scope constraints are the documented "intended use" limitation
NIST AI RMF
- GOVERN 1.6: the trigger rule register documents the intended use and scope constraints for each event-triggered agent behaviour
15. Reference Implementations
AWS
| Component | Service |
|---|---|
| Event Source | Amazon MSK (Managed Streaming for Apache Kafka), Amazon EventBridge |
| Trigger Evaluation | AWS Lambda (EventBridge Rule filter + Lambda processor) |
| Back-Pressure | SQS FIFO queue + Lambda reserved concurrency |
| Context Assembly | Lambda + DynamoDB + RDS + S3 |
| Agent Execution | ECS Fargate or Bedrock Agents |
| Audit Log | DynamoDB (event-to-action index) + S3 Object Lock |
Azure
| Component | Service |
|---|---|
| Event Source | Azure Event Hubs (Kafka-compatible), Azure Event Grid |
| Trigger Evaluation | Azure Functions (Event Grid trigger) |
| Back-Pressure | Azure Service Bus (sessions) + Functions auto-scaling |
| Agent Execution | Azure Container Apps |
On-Premises
| Component | Technology |
|---|---|
| Event Source | Apache Kafka (self-managed) |
| Trigger Evaluation | Kafka Streams custom app |
| Back-Pressure | Custom semaphore + Redis queue |
| Agent Execution | Kubernetes Jobs |
16. Related Patterns
| Pattern | ID | Relationship Type | Notes |
|---|---|---|---|
| Single Agent Pattern | EAAPL-AGT001 | Extended By | Agent execution is the core of the event response |
| Agent Identity and Authorisation | EAAPL-AGT009 | Depends On | Event-triggered agents need workload identity for context data access |
| Agent Cost Governance | EAAPL-AGT010 | Integrates With | Burst event handling creates cost spikes; cost governance provides budget controls |
| Human-in-the-Loop Agent | EAAPL-MAG003 | Peer | High-risk event-triggered actions route through human approval before execution |
| Agent Checkpoint and Recovery | EAAPL-AGT005 | Integrates With | Long context-assembly steps should be checkpointed for recovery |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Evidence |
|---|---|---|
| Event Infrastructure Maturity | 5 | Kafka, EventBridge, Event Grid are production-proven at hyperscale |
| Trigger Evaluation Patterns | 4 | Rule engines and CEP (Complex Event Processing) patterns well-established |
| AI-Specific Back-Pressure | 3 | LLM-specific concurrency limiting patterns still maturing; general queue patterns apply |
| Context Assembly Tooling | 3 | Multi-source enrichment patterns maturing; LLM-native context assembly tooling limited |
| Audit Linkage Standards | 3 | Event-to-action audit patterns exist but no standard schema; custom implementation required |
18. Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2024-05-01 | Architecture Board | Initial publication |
| 1.1 | 2024-09-20 | Platform Engineering | Added circuit breaker pattern; event flood scenario in data flow |
| 1.2 | 2025-02-28 | Architecture Board | Added OWASP LLM08 detail for event-triggered excessive agency; EU AI Act Art. 14 trigger gate pattern |