Mature

Router / Dispatcher

Agentic WorkflowsEU AI ActISO/IEC 42001

[EAAPL-WRK004] Router / Dispatcher

Category: Agentic Workflows Sub-category: Intelligent Routing Architecture Version: 1.0 Maturity: Mature Tags: routing, classifier, intent-detection, dispatcher, specialist-agents, fallback Regulatory Relevance: ISO 42001 §8.4, APRA CPS 230, EU AI Act (Art. 13)

1. Executive Summary

The Router / Dispatcher Pattern defines a central classifier that receives all incoming requests, determines their intent or type, and dispatches them to the appropriate specialist agent or processing chain. By separating routing logic from processing logic, this pattern enables a single entry point to handle diverse request types without any single processing chain becoming a bloated, multi-purpose monolith. The router acts as the traffic controller for a fleet of specialist workers.

For CIO/CTO audiences: rather than building one AI system that tries to handle every type of request, you build specialist handlers — each excellent at one task type — and a smart front door (the router) that sends each request to the right specialist. This mirrors how professional service firms operate: a legal query goes to the legal team, a financial query to finance, a compliance query to compliance. The difference is this routing happens in milliseconds, is transparent and auditable, and scales to millions of requests. The router is also the enforcement point for access control: it can block certain request types from reaching certain specialists based on user permissions.

2. Problem Statement

Business Problem

Enterprise AI systems receive diverse request types from users: policy questions, document generation, data retrieval, calculations, compliance queries. Routing all of these through a single general-purpose agent produces mediocre results across all categories because the agent must balance competing prompt instructions and cannot be deeply optimised for any single task type.

Technical Problem

A monolithic agent's prompt grows unbounded as new task types are added, reducing per-task quality. There is no mechanism to route high-confidence requests to fast, cheap handlers and complex requests to expensive, capable handlers. Access control cannot be enforced at the task-type level.

Symptoms of Absence

Single large prompt trying to handle all request types; quality average across all types rather than excellent for any
No cost optimisation by request type (all requests use the same expensive model)
Access control is coarse-grained; cannot restrict certain task types to authorised users
Adding a new task type requires modifying the monolithic prompt, risking regressions

Cost of Inaction

Quality: General-purpose handling produces lower quality than specialist handling across all domains
Cost: No ability to route simple requests to cheaper models
Maintainability: Monolithic handlers accumulate complexity and become unmaintainable

3. Context

When to Apply

Incoming request corpus is heterogeneous: multiple distinct task types with different optimal handling
Different task types have different model requirements (cost, capability, latency)
Access control at the task-type level is required
Specialist handlers already exist or can be built per task type

When NOT to Apply

All requests are of the same type (no routing needed)
Request types are not predictable upfront; types emerge from reasoning (use ReAct, EAAPL-WRK001)
Task type cannot be reliably classified from the request alone (low classifier confidence)
Only one or two task types with similar handling requirements

Prerequisites

Inventory of request types and their handling requirements
Confidence threshold policy (what to do when classifier is uncertain)
Fallback handler for unclassified or low-confidence requests
Specialist handlers (agents, chains) for each defined route

Industry Applicability

Industry	Router Use Case	Route Categories
Financial Services	Customer contact centre AI	Account query, dispute, product info, complaint, transfer
Legal	Document management AI	Contract review, research query, matter update, precedent search
Government	Citizen services portal	Benefit query, application status, regulatory question, escalation
Healthcare	Clinical decision support	Drug interaction check, guideline lookup, referral request, documentation
Technology	Developer productivity AI	Code generation, bug diagnosis, architecture question, documentation

4. Architecture Overview

The Router/Dispatcher has three logical layers: intake, classification, and dispatch.

Intake Layer The intake layer receives incoming requests, applies pre-processing (normalisation, PII detection, rate limiting, authentication context attachment), and prepares the request for classification. The intake layer is the first security enforcement point: requests from unauthenticated callers or blocked IP ranges are rejected before reaching the classifier.

Classification Layer The classifier receives the pre-processed request and produces a route decision: a route label (e.g., "contract_review", "policy_query", "data_extraction") and a confidence score. Classification can be implemented at multiple levels of sophistication:

Rule-based: Keyword matching and regex patterns; fast, deterministic, zero LLM cost; suitable for well-defined, structured request types
Embedding-based: Semantic similarity to route prototype embeddings; handles paraphrase variations; low latency, low cost
LLM-based: Full LLM classification prompt; handles complex, ambiguous requests; higher accuracy, higher latency and cost
Hybrid (Recommended): Rule-based fast path for high-confidence cases; LLM classifier for uncertain cases

Confidence Thresholding The classifier produces a confidence score. If confidence exceeds the high-confidence threshold, the request is dispatched immediately. If confidence falls between low and high thresholds, the request is dispatched but flagged for monitoring. Below the low threshold, the request is routed to the fallback handler or escalated for human clarification.

Dispatch Layer The dispatcher routes the classified request to the appropriate specialist handler with the full request context and route metadata. The specialist handler is invoked asynchronously; the dispatcher manages timeout, retry, and failure handling for the handler invocation.

Fallback Handler The fallback handler receives all low-confidence, unclassified, or handler-failure requests. It uses a capable general-purpose model to attempt to respond, clearly flagging to the caller that the specialised handling path was not used. Fallback responses are logged for use in improving the classifier and expanding the route inventory.

5. Architecture Diagram

ARCHITECTURE DIAGRAM

flowchart TD subgraph Intake["Intake Layer"] A[Incoming Request] B[Pre-processing] end subgraph Classification["Classification Layer"] C[Rule-Based Classifier] D{Confidence Threshold} E[LLM Classifier] D2{Route Decision} end subgraph Dispatch["Dispatch Layer"] F[Specialist Handler A] G[Specialist Handler B] H[Specialist Handler C] I[Specialist Handler N] J[Fallback Handler] end subgraph Output["Output"] K[Response + Metadata] L[Audit Log] end A --> B B --> C C --> D D -->|high confidence| D2 D -->|uncertain| E E --> D2 D2 -->|route A| F D2 -->|route B| G D2 -->|route C| H D2 -->|route N| I D2 -->|fallback| J F & G & H & I & J --> K K --> L

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Intake Pre-processor	Security + Logic	Auth check, rate limiting, PII detection, normalisation	Custom middleware; AWS WAF; Azure API Management	Critical
Rule-Based Classifier	Logic Component	Fast path classification via regex, keyword, schema matching	Custom Python; regex engine; JSONPath	High
LLM Classifier	AI Component	Classifies ambiguous requests using prompt-based intent detection	GPT-4o-mini, Claude 3 Haiku (cost-optimised); structured output	Critical
Confidence Gate	Logic Component	Applies threshold policy; routes to fast/slow/fallback path	Custom threshold logic; configurable per route	Critical
Dispatch Router	Orchestration	Routes classified request to appropriate specialist handler	Custom; API Gateway routing; LangChain router chain	Critical
Specialist Handlers	AI Components	Dedicated agents/chains per task type	EAAPL-WRK001/002 implementations per route	Critical
Fallback Handler	AI Component	General-purpose handler for unclassified requests	GPT-4o; Claude 3.5 Sonnet	High
Access Control Enforcer	Security	Checks user permissions against route requirements	RBAC via identity provider; custom permission matrix	Critical
Route Audit Logger	Governance	Records every classification decision and route taken	PostgreSQL; CloudWatch Logs; Splunk	High

7. Data Flow

Step	Actor	Action	Output
1	User	Submits request: "What are our obligations under APRA CPS 234 for cloud services?"	Raw request + user auth context
2	Intake Pre-processor	Validates auth, checks rate limit, detects no PII, normalises text	Prepared request with user role: "compliance_analyst"
3	Rule-Based Classifier	Matches keywords: "APRA", "CPS 234", "obligations" → candidate route: "regulatory_compliance"	Confidence: 0.91 — high confidence
4	Confidence Gate	0.91 > high_confidence_threshold (0.85) → fast path	Route: "regulatory_compliance"; skip LLM classifier
5	Access Control	User role "compliance_analyst" has access to "regulatory_compliance" route	Access granted
6	Dispatch Router	Invokes Regulatory Compliance Specialist Handler	Handler invoked with request + route context
7	Specialist Handler	Executes compliance Q&A chain (EAAPL-WRK002)	Structured answer with regulatory references
8	Audit Logger	Records: timestamp, user_id, route, confidence, handler_version, latency	Audit record persisted
9	Caller	Receives response with route metadata	Response + `{route: "regulatory_compliance", confidence: 0.91, handler: "v2.3"}`

Error Flow

Error	Detection	Recovery
Classifier returns confidence below minimum threshold	Confidence Gate	Route to fallback handler; flag response
Specialist handler timeout	Dispatch Router timeout	Retry once; if timeout again, route to fallback with `status: handler_timeout`
Access denied (user lacks route permission)	Access Control	Return 403 with clear message; log access attempt
No matching route (new request type)	Route matching failure	Fallback handler; log for route inventory expansion

8. Security Considerations

The Router as Security Enforcement Point

Every request passes through the router; it is the single enforcement point for access control, rate limiting, and content filtering
Misconfiguration of the router (e.g., a missing route permission check) can expose specialist handlers to unauthorised access

OWASP LLM Top 10

OWASP LLM Risk	Router/Dispatcher Applicability	Mitigation
LLM01 Prompt Injection	User request is untrusted input to classifier	Input sanitisation before classification; delimit user content in classifier prompt
LLM08 Excessive Agency	Router dispatches to powerful specialist handlers	Access control per route; least-privilege: only dispatch to handlers the user is authorised for
LLM04 Model DoS	High request volume floods classifier	Rate limiting at intake; async queue for burst absorption
LLM07 Insecure Plugin Design	Specialist handlers are effectively plugins	Handler permission scoping: each handler only has access to its required tools/data

9. Governance Considerations

Route Inventory Management

Every route must be documented: route name, description, handling handler, access control requirements, confidence threshold
New routes require architecture board review before deployment
Retired routes must be removed from the route inventory and redirected to fallback

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Route Inventory Register	AI Platform	On change; quarterly review	Canonical list of all routes, handlers, thresholds, and access control
Classifier Performance Report	ML Engineering	Monthly	Per-route precision/recall; identifies misclassification hot spots
Fallback Usage Report	AI Operations	Weekly	Tracks requests falling to fallback; drives route expansion decisions
Access Control Policy	Security	Quarterly	Route-to-user-role permission matrix

10. Operational Considerations

SLOs

SLO	Target	Window	Alert
Classification latency p95 (fast path)	≤ 50ms	1-hour rolling	> 200ms triggers P2; check rule engine performance
Classification latency p95 (LLM path)	≤ 2s	1-hour rolling	> 5s triggers P2
Route accuracy (correct route for known request type)	≥ 97%	Weekly eval	< 94% triggers P2; classifier retraining
Fallback rate (requests not classified to a specialist route)	≤ 5%	24-hour rolling	> 10% triggers P3; expand route inventory

Monitoring

Per-route request volume trending: route distribution shifts indicate changing user behaviour
Classifier confidence distribution: degrading confidence distribution indicates prompt drift
Fallback handler response quality: monitor for quality degradation in fallback responses

11. Cost Considerations

Classification Path	Latency	Cost per Request	Suitability
Rule-based only (fast path)	< 10ms	$0.000	High-confidence, well-defined request types
Embedding similarity	50–200ms	~$0.0001	Moderate variety; semantic matching needed
LLM classifier (small model)	500ms–2s	$0.001–0.005	Complex, ambiguous request classification
LLM classifier (large model)	2–5s	$0.01–0.05	Reserve for multi-label or novel request types

Optimisations

Implement rule-based fast path for the highest-volume, most predictable routes (typically 60–80% of traffic)
Use smallest capable model for LLM classification (GPT-4o-mini, Claude 3 Haiku)
Cache classification results for identical or near-identical requests (embedding similarity cache)

12. Trade-Off Analysis

Option	Accuracy	Latency	Cost	Flexibility	Best For
A: Hybrid rule + LLM classifier (Recommended)	High	Low–Medium	Low	High	Production; mixed traffic
B: Pure rule-based classifier	Medium	Very Low	Zero	Low	Very structured, predictable requests
C: Embedding similarity classifier	High	Low	Very Low	High	Semantic variety without LLM cost
D: LLM-only classifier	Very High	Medium	Medium	Very High	Novel or complex request taxonomy

Architectural Tensions

Tension	Left Pole	Right Pole	Balance
Classifier specificity vs. Coverage	Many narrow routes (each handled excellently)	Few broad routes (generalist handlers)	5–10 well-defined routes with a strong fallback
Classification speed vs. Accuracy	Fast rule-based (may misclassify edge cases)	Slow LLM-based (high accuracy, higher latency)	Hybrid: fast path for 70%+ of traffic
Access control granularity vs. Usability	Route-level ACL (fine-grained, complex)	System-level ACL (simple, coarse)	Route-level for regulated/sensitive routes; system-level for standard routes

13. Failure Modes

Failure Mode	Likelihood	Impact	Detection	Recovery
Classifier misroutes request to wrong specialist	Medium	Medium — poor quality response from wrong handler	Per-route user feedback; blind evaluation	Confidence threshold tuning; add rule for misclassified pattern
Specialist handler unavailable (all routes)	Low	Critical — all requests fail	Health check monitoring; circuit breaker	Fallback handler absorbs all traffic; alert
Access control bypass (misconfigured route permission)	Low	Critical — unauthorised access to sensitive handler	Security scanning; access log anomaly detection	Immediate route disable; incident response
Fallback handler overwhelmed	Medium	Medium — fallback quality degrades under load	Fallback error rate monitoring	Rate limit fallback; queue overflow; reject with retry-after
Route inventory stale (new request types exceed coverage)	Medium	Medium — high fallback rate	Fallback rate trending	Regular route expansion review driven by fallback log analysis

14. Regulatory Considerations

EU AI Act

Art. 13 (Transparency): The route taken for each request must be logged and available for audit. The route metadata (route label, confidence, handler version) is the transparency artefact.
For high-risk AI systems, the classifier decision must be explainable — embedding-based and LLM-based classifiers must produce the top contributing features or reasoning for the route decision.

ISO 42001

§8.4: The route inventory is an operational specification artefact; changes require change management.

Australian Context

APRA CPS 230: For material business processes routed through an AI dispatcher, the route configuration and audit log are operational resilience evidence.
OAIC automated decision guidance: The route taken is part of the automated decision trail and must be retainable for subject access requests.

15. Reference Implementations

AWS

Component	Service
Intake + Rate Limiting	Amazon API Gateway + AWS WAF
Rule-Based Classifier	AWS Lambda with custom keyword/regex logic
LLM Classifier	Amazon Bedrock (Claude 3 Haiku) with structured output
Dispatch	AWS Step Functions (state machine with Choice state)
Specialist Handlers	Lambda functions per route
Audit Logging	CloudWatch Logs + Kinesis Firehose → S3

Azure

Component	Service
Intake	Azure API Management (rate limiting, auth)
LLM Classifier	Azure OpenAI Service (GPT-4o-mini) with JSON mode
Dispatch	Azure Durable Functions (orchestrator with conditional dispatch)
Access Control	Azure AD + custom RBAC claims on route

On-Premises

Component	Technology
Classifier	LangChain RouterChain or custom FastAPI classifier service
Dispatch	Custom Python dispatcher with asyncio; LangGraph conditional edges
Specialist Handlers	FastAPI microservices per route

Pattern	ID	Relationship Type	Notes
Conditional Routing	EAAPL-WRK011	Peer	Conditional routing applies within a workflow; router/dispatcher routes at system entry point
Sequential Chain	EAAPL-WRK002	Dispatches To	Sequential chain is a common specialist handler target
Multi-Agent Orchestration	EAAPL-MAG001	Peer	Orchestration manages inter-agent coordination; dispatcher selects which agent to invoke
Human Escalation	EAAPL-HITL001	Integrates With	Low-confidence or restricted routes escalate to human handler
Workflow State Machine	EAAPL-WRK012	Integrates With	State machine can model multi-step dispatch decisions

17. Maturity Assessment

Overall Maturity: Mature

Dimension	Score (1–5)	Evidence
Research Foundation	4	Intent classification well-established in NLP; LLM routing newer but evidence-based
Production Deployment	5	Widely deployed in contact centres, enterprise AI platforms, developer tools
Framework Support	4	LangChain RouterChain; Semantic Router library; Azure Prompt Flow routing
Tooling Maturity	4	Intent classification tooling mature; confidence calibration still evolving
Observability	4	Per-route metrics widely available; confidence calibration dashboards maturing

18. Revision History

Version	Date	Author	Changes
1.0	2025-06-13	Architecture Board	Initial publication in Agentic Workflows category

Track this pattern for APRA/ASIC review

← Back to Library More Agentic Workflows →