[EAAPL-WRK004] Router / Dispatcher
Category: Agentic Workflows
Sub-category: Intelligent Routing Architecture
Version: 1.0
Maturity: Mature
Tags: routing, classifier, intent-detection, dispatcher, specialist-agents, fallback
Regulatory Relevance: ISO 42001 §8.4, APRA CPS 230, EU AI Act (Art. 13)
1. Executive Summary
The Router / Dispatcher Pattern defines a central classifier that receives all incoming requests, determines their intent or type, and dispatches them to the appropriate specialist agent or processing chain. By separating routing logic from processing logic, this pattern enables a single entry point to handle diverse request types without any single processing chain becoming a bloated, multi-purpose monolith. The router acts as the traffic controller for a fleet of specialist workers.
For CIO/CTO audiences: rather than building one AI system that tries to handle every type of request, you build specialist handlers — each excellent at one task type — and a smart front door (the router) that sends each request to the right specialist. This mirrors how professional service firms operate: a legal query goes to the legal team, a financial query to finance, a compliance query to compliance. The difference is this routing happens in milliseconds, is transparent and auditable, and scales to millions of requests. The router is also the enforcement point for access control: it can block certain request types from reaching certain specialists based on user permissions.
2. Problem Statement
Business Problem
Enterprise AI systems receive diverse request types from users: policy questions, document generation, data retrieval, calculations, compliance queries. Routing all of these through a single general-purpose agent produces mediocre results across all categories because the agent must balance competing prompt instructions and cannot be deeply optimised for any single task type.
Technical Problem
A monolithic agent's prompt grows unbounded as new task types are added, reducing per-task quality. There is no mechanism to route high-confidence requests to fast, cheap handlers and complex requests to expensive, capable handlers. Access control cannot be enforced at the task-type level.
Symptoms of Absence
- Single large prompt trying to handle all request types; quality average across all types rather than excellent for any
- No cost optimisation by request type (all requests use the same expensive model)
- Access control is coarse-grained; cannot restrict certain task types to authorised users
- Adding a new task type requires modifying the monolithic prompt, risking regressions
Cost of Inaction
- Quality: General-purpose handling produces lower quality than specialist handling across all domains
- Cost: No ability to route simple requests to cheaper models
- Maintainability: Monolithic handlers accumulate complexity and become unmaintainable
3. Context
When to Apply
- Incoming request corpus is heterogeneous: multiple distinct task types with different optimal handling
- Different task types have different model requirements (cost, capability, latency)
- Access control at the task-type level is required
- Specialist handlers already exist or can be built per task type
When NOT to Apply
- All requests are of the same type (no routing needed)
- Request types are not predictable upfront; types emerge from reasoning (use ReAct, EAAPL-WRK001)
- Task type cannot be reliably classified from the request alone (low classifier confidence)
- Only one or two task types with similar handling requirements
Prerequisites
- Inventory of request types and their handling requirements
- Confidence threshold policy (what to do when classifier is uncertain)
- Fallback handler for unclassified or low-confidence requests
- Specialist handlers (agents, chains) for each defined route
Industry Applicability
| Industry |
Router Use Case |
Route Categories |
| Financial Services |
Customer contact centre AI |
Account query, dispute, product info, complaint, transfer |
| Legal |
Document management AI |
Contract review, research query, matter update, precedent search |
| Government |
Citizen services portal |
Benefit query, application status, regulatory question, escalation |
| Healthcare |
Clinical decision support |
Drug interaction check, guideline lookup, referral request, documentation |
| Technology |
Developer productivity AI |
Code generation, bug diagnosis, architecture question, documentation |
4. Architecture Overview
The Router/Dispatcher has three logical layers: intake, classification, and dispatch.
Intake Layer
The intake layer receives incoming requests, applies pre-processing (normalisation, PII detection, rate limiting, authentication context attachment), and prepares the request for classification. The intake layer is the first security enforcement point: requests from unauthenticated callers or blocked IP ranges are rejected before reaching the classifier.
Classification Layer
The classifier receives the pre-processed request and produces a route decision: a route label (e.g., "contract_review", "policy_query", "data_extraction") and a confidence score. Classification can be implemented at multiple levels of sophistication:
- Rule-based: Keyword matching and regex patterns; fast, deterministic, zero LLM cost; suitable for well-defined, structured request types
- Embedding-based: Semantic similarity to route prototype embeddings; handles paraphrase variations; low latency, low cost
- LLM-based: Full LLM classification prompt; handles complex, ambiguous requests; higher accuracy, higher latency and cost
- Hybrid (Recommended): Rule-based fast path for high-confidence cases; LLM classifier for uncertain cases
Confidence Thresholding
The classifier produces a confidence score. If confidence exceeds the high-confidence threshold, the request is dispatched immediately. If confidence falls between low and high thresholds, the request is dispatched but flagged for monitoring. Below the low threshold, the request is routed to the fallback handler or escalated for human clarification.
Dispatch Layer
The dispatcher routes the classified request to the appropriate specialist handler with the full request context and route metadata. The specialist handler is invoked asynchronously; the dispatcher manages timeout, retry, and failure handling for the handler invocation.
Fallback Handler
The fallback handler receives all low-confidence, unclassified, or handler-failure requests. It uses a capable general-purpose model to attempt to respond, clearly flagging to the caller that the specialised handling path was not used. Fallback responses are logged for use in improving the classifier and expanding the route inventory.
5. Architecture Diagram
flowchart TD
subgraph Intake["Intake Layer"]
A[Incoming Request]
B[Pre-processing]
end
subgraph Classification["Classification Layer"]
C[Rule-Based Classifier]
D{Confidence Threshold}
E[LLM Classifier]
D2{Route Decision}
end
subgraph Dispatch["Dispatch Layer"]
F[Specialist Handler A]
G[Specialist Handler B]
H[Specialist Handler C]
I[Specialist Handler N]
J[Fallback Handler]
end
subgraph Output["Output"]
K[Response + Metadata]
L[Audit Log]
end
A --> B
B --> C
C --> D
D -->|high confidence| D2
D -->|uncertain| E
E --> D2
D2 -->|route A| F
D2 -->|route B| G
D2 -->|route C| H
D2 -->|route N| I
D2 -->|fallback| J
F & G & H & I & J --> K
K --> L
6. Components
| Component |
Type |
Responsibility |
Technology Options |
Criticality |
| Intake Pre-processor |
Security + Logic |
Auth check, rate limiting, PII detection, normalisation |
Custom middleware; AWS WAF; Azure API Management |
Critical |
| Rule-Based Classifier |
Logic Component |
Fast path classification via regex, keyword, schema matching |
Custom Python; regex engine; JSONPath |
High |
| LLM Classifier |
AI Component |
Classifies ambiguous requests using prompt-based intent detection |
GPT-4o-mini, Claude 3 Haiku (cost-optimised); structured output |
Critical |
| Confidence Gate |
Logic Component |
Applies threshold policy; routes to fast/slow/fallback path |
Custom threshold logic; configurable per route |
Critical |
| Dispatch Router |
Orchestration |
Routes classified request to appropriate specialist handler |
Custom; API Gateway routing; LangChain router chain |
Critical |
| Specialist Handlers |
AI Components |
Dedicated agents/chains per task type |
EAAPL-WRK001/002 implementations per route |
Critical |
| Fallback Handler |
AI Component |
General-purpose handler for unclassified requests |
GPT-4o; Claude 3.5 Sonnet |
High |
| Access Control Enforcer |
Security |
Checks user permissions against route requirements |
RBAC via identity provider; custom permission matrix |
Critical |
| Route Audit Logger |
Governance |
Records every classification decision and route taken |
PostgreSQL; CloudWatch Logs; Splunk |
High |
7. Data Flow
| Step |
Actor |
Action |
Output |
| 1 |
User |
Submits request: "What are our obligations under APRA CPS 234 for cloud services?" |
Raw request + user auth context |
| 2 |
Intake Pre-processor |
Validates auth, checks rate limit, detects no PII, normalises text |
Prepared request with user role: "compliance_analyst" |
| 3 |
Rule-Based Classifier |
Matches keywords: "APRA", "CPS 234", "obligations" → candidate route: "regulatory_compliance" |
Confidence: 0.91 — high confidence |
| 4 |
Confidence Gate |
0.91 > high_confidence_threshold (0.85) → fast path |
Route: "regulatory_compliance"; skip LLM classifier |
| 5 |
Access Control |
User role "compliance_analyst" has access to "regulatory_compliance" route |
Access granted |
| 6 |
Dispatch Router |
Invokes Regulatory Compliance Specialist Handler |
Handler invoked with request + route context |
| 7 |
Specialist Handler |
Executes compliance Q&A chain (EAAPL-WRK002) |
Structured answer with regulatory references |
| 8 |
Audit Logger |
Records: timestamp, user_id, route, confidence, handler_version, latency |
Audit record persisted |
| 9 |
Caller |
Receives response with route metadata |
Response + {route: "regulatory_compliance", confidence: 0.91, handler: "v2.3"} |
Error Flow
| Error |
Detection |
Recovery |
| Classifier returns confidence below minimum threshold |
Confidence Gate |
Route to fallback handler; flag response |
| Specialist handler timeout |
Dispatch Router timeout |
Retry once; if timeout again, route to fallback with status: handler_timeout |
| Access denied (user lacks route permission) |
Access Control |
Return 403 with clear message; log access attempt |
| No matching route (new request type) |
Route matching failure |
Fallback handler; log for route inventory expansion |
8. Security Considerations
The Router as Security Enforcement Point
- Every request passes through the router; it is the single enforcement point for access control, rate limiting, and content filtering
- Misconfiguration of the router (e.g., a missing route permission check) can expose specialist handlers to unauthorised access
OWASP LLM Top 10
| OWASP LLM Risk |
Router/Dispatcher Applicability |
Mitigation |
| LLM01 Prompt Injection |
User request is untrusted input to classifier |
Input sanitisation before classification; delimit user content in classifier prompt |
| LLM08 Excessive Agency |
Router dispatches to powerful specialist handlers |
Access control per route; least-privilege: only dispatch to handlers the user is authorised for |
| LLM04 Model DoS |
High request volume floods classifier |
Rate limiting at intake; async queue for burst absorption |
| LLM07 Insecure Plugin Design |
Specialist handlers are effectively plugins |
Handler permission scoping: each handler only has access to its required tools/data |
9. Governance Considerations
Route Inventory Management
- Every route must be documented: route name, description, handling handler, access control requirements, confidence threshold
- New routes require architecture board review before deployment
- Retired routes must be removed from the route inventory and redirected to fallback
Governance Artefacts
| Artefact |
Owner |
Frequency |
Purpose |
| Route Inventory Register |
AI Platform |
On change; quarterly review |
Canonical list of all routes, handlers, thresholds, and access control |
| Classifier Performance Report |
ML Engineering |
Monthly |
Per-route precision/recall; identifies misclassification hot spots |
| Fallback Usage Report |
AI Operations |
Weekly |
Tracks requests falling to fallback; drives route expansion decisions |
| Access Control Policy |
Security |
Quarterly |
Route-to-user-role permission matrix |
10. Operational Considerations
SLOs
| SLO |
Target |
Window |
Alert |
| Classification latency p95 (fast path) |
≤ 50ms |
1-hour rolling |
> 200ms triggers P2; check rule engine performance |
| Classification latency p95 (LLM path) |
≤ 2s |
1-hour rolling |
> 5s triggers P2 |
| Route accuracy (correct route for known request type) |
≥ 97% |
Weekly eval |
< 94% triggers P2; classifier retraining |
| Fallback rate (requests not classified to a specialist route) |
≤ 5% |
24-hour rolling |
> 10% triggers P3; expand route inventory |
Monitoring
- Per-route request volume trending: route distribution shifts indicate changing user behaviour
- Classifier confidence distribution: degrading confidence distribution indicates prompt drift
- Fallback handler response quality: monitor for quality degradation in fallback responses
11. Cost Considerations
| Classification Path |
Latency |
Cost per Request |
Suitability |
| Rule-based only (fast path) |
< 10ms |
$0.000 |
High-confidence, well-defined request types |
| Embedding similarity |
50–200ms |
~$0.0001 |
Moderate variety; semantic matching needed |
| LLM classifier (small model) |
500ms–2s |
$0.001–0.005 |
Complex, ambiguous request classification |
| LLM classifier (large model) |
2–5s |
$0.01–0.05 |
Reserve for multi-label or novel request types |
Optimisations
- Implement rule-based fast path for the highest-volume, most predictable routes (typically 60–80% of traffic)
- Use smallest capable model for LLM classification (GPT-4o-mini, Claude 3 Haiku)
- Cache classification results for identical or near-identical requests (embedding similarity cache)
12. Trade-Off Analysis
| Option |
Accuracy |
Latency |
Cost |
Flexibility |
Best For |
| A: Hybrid rule + LLM classifier (Recommended) |
High |
Low–Medium |
Low |
High |
Production; mixed traffic |
| B: Pure rule-based classifier |
Medium |
Very Low |
Zero |
Low |
Very structured, predictable requests |
| C: Embedding similarity classifier |
High |
Low |
Very Low |
High |
Semantic variety without LLM cost |
| D: LLM-only classifier |
Very High |
Medium |
Medium |
Very High |
Novel or complex request taxonomy |
Architectural Tensions
| Tension |
Left Pole |
Right Pole |
Balance |
| Classifier specificity vs. Coverage |
Many narrow routes (each handled excellently) |
Few broad routes (generalist handlers) |
5–10 well-defined routes with a strong fallback |
| Classification speed vs. Accuracy |
Fast rule-based (may misclassify edge cases) |
Slow LLM-based (high accuracy, higher latency) |
Hybrid: fast path for 70%+ of traffic |
| Access control granularity vs. Usability |
Route-level ACL (fine-grained, complex) |
System-level ACL (simple, coarse) |
Route-level for regulated/sensitive routes; system-level for standard routes |
13. Failure Modes
| Failure Mode |
Likelihood |
Impact |
Detection |
Recovery |
| Classifier misroutes request to wrong specialist |
Medium |
Medium — poor quality response from wrong handler |
Per-route user feedback; blind evaluation |
Confidence threshold tuning; add rule for misclassified pattern |
| Specialist handler unavailable (all routes) |
Low |
Critical — all requests fail |
Health check monitoring; circuit breaker |
Fallback handler absorbs all traffic; alert |
| Access control bypass (misconfigured route permission) |
Low |
Critical — unauthorised access to sensitive handler |
Security scanning; access log anomaly detection |
Immediate route disable; incident response |
| Fallback handler overwhelmed |
Medium |
Medium — fallback quality degrades under load |
Fallback error rate monitoring |
Rate limit fallback; queue overflow; reject with retry-after |
| Route inventory stale (new request types exceed coverage) |
Medium |
Medium — high fallback rate |
Fallback rate trending |
Regular route expansion review driven by fallback log analysis |
14. Regulatory Considerations
EU AI Act
- Art. 13 (Transparency): The route taken for each request must be logged and available for audit. The route metadata (route label, confidence, handler version) is the transparency artefact.
- For high-risk AI systems, the classifier decision must be explainable — embedding-based and LLM-based classifiers must produce the top contributing features or reasoning for the route decision.
ISO 42001
- §8.4: The route inventory is an operational specification artefact; changes require change management.
Australian Context
- APRA CPS 230: For material business processes routed through an AI dispatcher, the route configuration and audit log are operational resilience evidence.
- OAIC automated decision guidance: The route taken is part of the automated decision trail and must be retainable for subject access requests.
15. Reference Implementations
AWS
| Component |
Service |
| Intake + Rate Limiting |
Amazon API Gateway + AWS WAF |
| Rule-Based Classifier |
AWS Lambda with custom keyword/regex logic |
| LLM Classifier |
Amazon Bedrock (Claude 3 Haiku) with structured output |
| Dispatch |
AWS Step Functions (state machine with Choice state) |
| Specialist Handlers |
Lambda functions per route |
| Audit Logging |
CloudWatch Logs + Kinesis Firehose → S3 |
Azure
| Component |
Service |
| Intake |
Azure API Management (rate limiting, auth) |
| LLM Classifier |
Azure OpenAI Service (GPT-4o-mini) with JSON mode |
| Dispatch |
Azure Durable Functions (orchestrator with conditional dispatch) |
| Access Control |
Azure AD + custom RBAC claims on route |
On-Premises
| Component |
Technology |
| Classifier |
LangChain RouterChain or custom FastAPI classifier service |
| Dispatch |
Custom Python dispatcher with asyncio; LangGraph conditional edges |
| Specialist Handlers |
FastAPI microservices per route |
| Pattern |
ID |
Relationship Type |
Notes |
| Conditional Routing |
EAAPL-WRK011 |
Peer |
Conditional routing applies within a workflow; router/dispatcher routes at system entry point |
| Sequential Chain |
EAAPL-WRK002 |
Dispatches To |
Sequential chain is a common specialist handler target |
| Multi-Agent Orchestration |
EAAPL-MAG001 |
Peer |
Orchestration manages inter-agent coordination; dispatcher selects which agent to invoke |
| Human Escalation |
EAAPL-HITL001 |
Integrates With |
Low-confidence or restricted routes escalate to human handler |
| Workflow State Machine |
EAAPL-WRK012 |
Integrates With |
State machine can model multi-step dispatch decisions |
17. Maturity Assessment
Overall Maturity: Mature
| Dimension |
Score (1–5) |
Evidence |
| Research Foundation |
4 |
Intent classification well-established in NLP; LLM routing newer but evidence-based |
| Production Deployment |
5 |
Widely deployed in contact centres, enterprise AI platforms, developer tools |
| Framework Support |
4 |
LangChain RouterChain; Semantic Router library; Azure Prompt Flow routing |
| Tooling Maturity |
4 |
Intent classification tooling mature; confidence calibration still evolving |
| Observability |
4 |
Per-route metrics widely available; confidence calibration dashboards maturing |
18. Revision History
| Version |
Date |
Author |
Changes |
| 1.0 |
2025-06-13 |
Architecture Board |
Initial publication in Agentic Workflows category |