EAAPL: Enterprise AI Architecture Pattern Library

[EAAPL-WRK006] Tool Call Orchestration

Category: Agentic Workflows
Sub-category: Tool Execution Architecture
Version: 1.0
Maturity: Industry Standard
Tags: tool-calling, function-calling, tool-use, parameter-extraction, result-injection, tool-budget
Regulatory Relevance: ISO 42001 §8.4, APRA CPS 234, EU AI Act (Art. 9)


1. Executive Summary

The Tool Call Orchestration Pattern defines the execution mechanics for structured tool use within an agent's reasoning loop: how tools are selected from a registry, how parameters are extracted and validated, how tool results are injected back into context, how errors are handled, and how tool call budgets are enforced. While the Agent Tool Registry (EAAPL-AGT003) defines the registration and discovery contract for tools, this pattern covers the runtime orchestration of tool execution — the moment-to-moment mechanics of safely and reliably executing tool calls within an agentic workflow.

For CIO/CTO audiences: tools are what give AI agents their power to interact with the real world — querying databases, calling APIs, reading documents, writing records. But that power creates risk. An agent that can call any tool, with any parameters, without limits is an operational liability. This pattern defines the guardrails: parameter validation before execution, permission checking per tool call, result sanitisation before injection, error handling that does not silently corrupt the agent's reasoning, and hard limits on the number and cost of tool calls per task. These are not optional — they are the operational controls that make tool-using agents safe enough to deploy in regulated enterprise environments.


2. Problem Statement

Business Problem

Tool-using agents interact with live business systems — databases, APIs, file systems, communication services. Uncontrolled tool invocation creates operational risk: incorrect parameters corrupt data, excessive calls exhaust API quotas, and unhandled errors produce silent failures that the agent treats as successful tool calls.

Technical Problem

The raw output of an LLM tool-call inference step is a JSON object specifying a tool name and parameters. This raw output may contain: invalid parameter types, parameters that exceed allowed value ranges, calls to tools the current user is not authorised to use, and calls with hallucinated parameter values that would produce runtime errors. Simply forwarding this raw output to tool execution is insufficient.

Symptoms of Absence

  • Tool call errors are silently swallowed and misrepresented as successful observations in the agent scratchpad
  • No parameter validation: hallucinated parameter values cause downstream data corruption
  • No per-task tool call budget: runaway agents exhaust API quotas, incurring unexpected costs
  • Tool calls execute with the agent's full permissions regardless of the sensitivity of the specific tool

Cost of Inaction

  • Data Integrity: Unvalidated parameters passed to write-capable tools can corrupt production data
  • Cost Control: Unlimited tool calls create unpredictable cost exposure
  • Security: Unscoped tool permissions create lateral movement risk if an agent is compromised

3. Context

When to Apply

  • Agents execute tool calls within any reasoning loop (ReAct, Plan-and-Execute, Sequential Chain)
  • Tools interact with external systems, databases, or APIs
  • Per-task tool call budgets are required
  • Tool results contain potentially untrusted content that must be sanitised before context injection

When NOT to Apply

  • Pure LLM workflows with no external tool use
  • Fully sandboxed environments where tool isolation is provided by the execution environment (still apply parameter validation, but permission model may be simplified)

Prerequisites

  • EAAPL-AGT003 (Agent Tool Registry) for tool discovery and permission definitions
  • Tool call budget policy (max calls per task, max calls per tool type)
  • Parameter schema definitions for all registered tools
  • Result sanitisation policy per tool

Industry Applicability

| Industry | Tool Types Used | Key Orchestration Requirement |
| --- | --- | --- |
| Financial Services | Database query, API calls, calculation engine | Parameter validation to prevent SQL injection; result sanitisation |
| Legal | Document search, court record lookup, drafting API | Permission scoping per matter; budget control for search tools |
| Healthcare | Clinical database, drug interaction API, EHR write | Strict parameter validation for write tools; safety checks |
| Government | Records system, geospatial API, regulatory database | Audit every tool call; permission scoping per officer role |
| Technology | Code execution, test runner, version control | Sandbox enforcement; budget control for compute tools |

4. Architecture Overview

The Tool Call Orchestration layer sits between the agent's LLM inference step and the actual tool execution infrastructure. It is a mandatory gate through which every tool call must pass.

Parameter Extraction and Validation

The LLM produces a tool call specification in its inference output (either via native function calling JSON schema output or via parsed scratchpad action text). The Parameter Extractor parses this into a structured tool call object (tool_name, parameters dict). The Parameter Validator then validates every parameter against the tool's registered schema: type checking, value range validation, required field presence, and pattern matching. Invalid parameters are not forwarded to execution; instead, they generate a correction observation that is injected back into the agent's context.
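
The validation step described above can be sketched with the standard library alone. This is a minimal illustration, not the pattern's reference implementation: `ParamSpec`, `SEARCH_SCHEMA`, and `validate_params` are hypothetical names, the limits mirror the `search_regulatory_db` example used later in the data flow, rejecting unknown parameters is an assumption, and pattern matching is omitted for brevity. In practice a schema library such as Pydantic v2 or a JSON Schema validator would likely replace the hand-written checks.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ParamSpec:
    """Hypothetical schema entry for a single tool parameter."""
    type_: type
    required: bool = True
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    max_length: Optional[int] = None


# Illustrative schema for the search_regulatory_db tool (limits are assumptions).
SEARCH_SCHEMA = {
    "query": ParamSpec(str, max_length=500),
    "limit": ParamSpec(int, min_value=1, max_value=50),
}


def validate_params(params: dict[str, Any], schema: dict[str, ParamSpec]) -> list[str]:
    """Return a list of validation errors; an empty list means PASS."""
    errors: list[str] = []
    for name, spec in schema.items():
        if name not in params:
            if spec.required:
                errors.append(f"missing required parameter '{name}'")
            continue
        value = params[name]
        # bool is a subclass of int in Python, so exclude it from int fields explicitly.
        wrong_type = not isinstance(value, spec.type_) or (
            spec.type_ is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(
                f"parameter '{name}' must be {spec.type_.__name__}, "
                f"got {type(value).__name__} {value!r}"
            )
            continue
        if spec.min_value is not None and value < spec.min_value:
            errors.append(f"parameter '{name}' must be >= {spec.min_value}, got {value}")
        if spec.max_value is not None and value > spec.max_value:
            errors.append(f"parameter '{name}' must be <= {spec.max_value}, got {value}")
        if spec.max_length is not None and len(value) > spec.max_length:
            errors.append(f"parameter '{name}' exceeds {spec.max_length} characters")
    # Assumption: parameters that are not in the schema are rejected, not ignored.
    for name in sorted(params.keys() - schema.keys()):
        errors.append(f"unknown parameter '{name}'")
    return errors
```

Each returned string can be embedded directly in a correction observation, so the agent sees which parameter failed and why.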

Permission Gate

Before execution, the Permission Gate checks that the current task's permission scope includes the requested tool and the specific operation (read vs. write vs. admin). The permission scope is established at task initialisation time from the user's identity and the task type's permission policy. Tool calls outside the permission scope are rejected with a structured permission error; the agent can reason about this rejection and choose an alternative approach.
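
A minimal sketch of such a gate follows, assuming scopes are plain strings in an "operation:resource" form as in the data flow example. `TOOL_REQUIRED_SCOPE`, `PermissionDecision`, and `check_permission` are hypothetical names; a production gate would more likely delegate to OPA or an IAM policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionDecision:
    granted: bool
    observation: str = ""


# Hypothetical mapping from tool name to the scope it requires.
TOOL_REQUIRED_SCOPE = {
    "search_regulatory_db": "read:regulatory_db",
    "write_record": "write:records",
}


def check_permission(tool_name: str, task_scopes: frozenset[str]) -> PermissionDecision:
    """Grant the call only if the task scope contains the tool's required scope."""
    required = TOOL_REQUIRED_SCOPE.get(tool_name)
    if required is not None and required in task_scopes:
        return PermissionDecision(granted=True)
    # Unknown tools and out-of-scope tools are both denied (deny by default).
    permitted = sorted(
        tool for tool, scope in TOOL_REQUIRED_SCOPE.items() if scope in task_scopes
    )
    return PermissionDecision(
        granted=False,
        observation=(
            f"Tool '{tool_name}' is not permitted for this task. "
            f"Available tools: {permitted}"
        ),
    )
```

Listing the permitted tools in the denial gives the agent enough information to pick an alternative without probing the gate repeatedly.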

Tool Execution with Timeout

Validated, permitted tool calls are forwarded to the tool execution layer (EAAPL-AGT003). Each call is wrapped in a timeout enforced by the orchestrator, so a tool that hangs does not block the agent indefinitely. The timeout is configurable per tool type (fast API calls: 5s; slow database queries: 30s).
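
One way to wrap a blocking tool in a timeout is a worker thread from `concurrent.futures`, sketched below. `execute_with_timeout`, `TOOL_TIMEOUTS_S`, and `slow_tool` are hypothetical names. Note the limitation stated in the comments: Python cannot kill a running thread, so this bounds how long the agent waits, not how long the tool runs; tools with side effects may need process-level isolation or cooperative cancellation instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable

# Per-tool-type timeouts taken from the examples in the text.
TOOL_TIMEOUTS_S = {"api": 5.0, "database": 30.0}


def execute_with_timeout(
    tool_name: str,
    tool_fn: Callable[..., Any],
    params: dict[str, Any],
    timeout_s: float,
) -> dict[str, Any]:
    """Run tool_fn(**params), giving up on the result after timeout_s seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool_fn, **params)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except FutureTimeout:
        return {
            "ok": False,
            "observation": (
                f"Tool '{tool_name}' timed out after {timeout_s:g}s. "
                "Consider an alternative approach or a simpler query."
            ),
        }
    finally:
        # Stop waiting; the worker thread itself still runs to completion.
        pool.shutdown(wait=False)


def slow_tool(seconds: float) -> str:
    """Stand-in for a tool that takes a while to respond."""
    time.sleep(seconds)
    return "done"
```

For example, `execute_with_timeout("query_legacy_db", slow_tool, {"seconds": 60}, TOOL_TIMEOUTS_S["database"])` would return the timeout observation after 30 seconds rather than blocking for the full minute.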

Result Sanitisation

Tool results are processed through the Result Sanitiser before injection into the agent's context. Sanitisation: (a) enforces a maximum result length (truncates with a summary marker if exceeded), (b) strips potentially injected instruction patterns from string results (prompt injection defence), (c) validates the result schema against the tool's declared output schema.
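
Steps (a) and (b) can be sketched as follows; step (c) is omitted because it depends on the tool's declared output schema. The two regular expressions are illustrative assumptions only, and a pattern deny-list on its own is a weak defence that should be combined with the other controls in this pattern. `sanitise_result` and `INJECTION_PATTERNS` are hypothetical names; the 2000-character limit is the one used in the data flow example.

```python
import re

MAX_RESULT_CHARS = 2000

# Illustrative patterns only; a real deny-list needs continuous tuning.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def sanitise_result(text: str, max_chars: int = MAX_RESULT_CHARS) -> tuple[str, bool]:
    """Return (sanitised_text, injection_flagged)."""
    flagged = False
    for pattern in INJECTION_PATTERNS:
        text, replacements = pattern.subn("[REMOVED]", text)
        flagged = flagged or replacements > 0
    # Truncate after stripping so the length limit applies to what the agent sees.
    if len(text) > max_chars:
        text = f"[TRUNCATED at {max_chars} chars] " + text[:max_chars]
    return text, flagged
```

The returned flag lets the orchestrator raise an anomaly alert when a pattern is stripped, rather than silently cleaning the result.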

Tool Call Budget

Every tool call decrements the task's tool call budget. When the budget is exhausted, the orchestrator rejects further tool calls and injects a budget-exhausted observation into the agent's context, triggering the agent to synthesise its final answer from available information. The budget is tracked per tool type to enable fine-grained control (e.g., maximum 3 write operations, unlimited read operations).
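
A per-type budget can be as simple as a pair of dictionaries held in task state. The sketch below is an assumption about shape, not a prescribed design: `BudgetController` and its methods are hypothetical names, `None` stands for "unlimited", and operation types without a configured budget are rejected by default.

```python
from typing import Optional


class BudgetController:
    """Tracks tool calls per operation type; a limit of None means unlimited."""

    def __init__(self, limits: dict[str, Optional[int]]) -> None:
        self._limits = dict(limits)
        self._used = {op_type: 0 for op_type in limits}

    def try_consume(self, op_type: str) -> bool:
        """Count one call against op_type; return False if none remain."""
        if op_type not in self._limits:
            return False  # assumption: unconfigured operation types are denied
        limit = self._limits[op_type]
        if limit is not None and self._used[op_type] >= limit:
            return False
        self._used[op_type] += 1
        return True

    def exhausted_observation(self, op_type: str) -> str:
        """Observation text to inject once try_consume has returned False."""
        return (
            f"Tool call budget exhausted ({self._used[op_type]}/"
            f"{self._limits[op_type]} {op_type} calls used). "
            "Synthesise answer from available information."
        )
```

With `BudgetController({"write": 3, "read": None})`, the fourth write is refused while reads continue, matching the example in the paragraph above. In a distributed deployment the counter would need an atomic store instead of in-process state.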

Audit Record

Every tool call, including rejected calls, is written to the task audit record: tool name, parameters (sanitised of secrets), result summary, permission outcome, timestamp, and budget state.
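
The audit entry can be sketched as one JSON line per call. The field names follow the list above; `SECRET_KEYS`, `redact_secrets`, and `build_audit_record` are hypothetical, and the key-name deny-list is an assumption that only covers top-level parameters.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional

# Assumed list of parameter names that must never reach the audit log.
SECRET_KEYS = {"api_key", "password", "token", "authorization"}


def redact_secrets(params: dict[str, Any]) -> dict[str, Any]:
    """Replace values of secret-looking top-level keys with a marker."""
    return {
        key: "[REDACTED]" if key.lower() in SECRET_KEYS else value
        for key, value in params.items()
    }


def build_audit_record(
    tool_name: str,
    params: dict[str, Any],
    permission_outcome: str,
    result_summary: str,
    budget_state: str,
    now: Optional[datetime] = None,
) -> str:
    """Serialise one tool call attempt as a JSON line."""
    record = {
        "timestamp": (now or datetime.now(timezone.utc)).isoformat(),
        "tool_name": tool_name,
        "parameters": redact_secrets(params),
        "permission_outcome": permission_outcome,
        "result_summary": result_summary,
        "budget_state": budget_state,
    }
    return json.dumps(record, sort_keys=True)
```

Rejected calls use the same function with a `permission_outcome` such as "DENIED" and an empty result summary, so the log has one shape for every attempt.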


5. Architecture Diagram

flowchart TD
    subgraph Agent["Agent Reasoning Loop"]
        A[LLM Tool Call]
    end
    subgraph Orchestration["Tool Call Orchestration Layer"]
        B[Parameter Extractor]
        C{Parameter Validation}
        D{Permission Gate}
        E{Budget Check}
        F[Tool Executor]
        G[Result Sanitiser]
    end
    subgraph Tools["Tool Registry"]
        H[Tool A: Database]
        I[Tool B: External API]
        J[Tool C: Write Op]
    end
    subgraph Feedback["Context Injection"]
        K[Valid Observation]
        L[Error Observation]
    end
    subgraph Audit["Audit"]
        M[(Tool Call Audit Log)]
    end
    A --> B
    B --> C
    C -->|invalid params| L
    C -->|valid| D
    D -->|denied| L
    D -->|permitted| E
    E -->|budget exhausted| L
    E -->|budget available| F
    F --> H & I & J
    H & I & J --> G
    G --> K
    K --> M
    L --> M

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
| --- | --- | --- | --- | --- |
| Parameter Extractor | Logic Component | Parses LLM tool call output into structured tool call object | Native function calling parser; custom JSON/regex parser | Critical |
| Parameter Validator | Logic Component | Validates parameters against tool schema | Pydantic v2; JSON Schema validator; custom type checks | Critical |
| Permission Gate | Security | Checks tool + operation against task permission scope | Custom RBAC; OPA (Open Policy Agent); IAM policy evaluation | Critical |
| Budget Controller | Safety | Tracks and enforces per-task, per-tool-type call budgets | Counter in task state; configurable limits per task type | Critical |
| Tool Executor | Integration | Invokes the registered tool with validated parameters; enforces timeout | EAAPL-AGT003 tool invocation layer | Critical |
| Result Sanitiser | Security + Logic | Truncates, validates, and sanitises tool results before context injection | Custom Python; LangChain output parser; regex content filter | Critical |
| Error Observation Generator | Logic | Produces structured correction observations for validation/permission/budget failures | Custom; prompt templates per error type | High |
| Tool Call Audit Logger | Governance | Records every tool call attempt with full metadata | PostgreSQL; CloudWatch Logs; Splunk | High |

7. Data Flow

| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | LLM | Produces tool call in inference output | {"tool": "search_regulatory_db", "params": {"query": "CPS 234", "limit": 10}} |
| 2 | Parameter Extractor | Parses tool call object | Structured: {tool_name: "search_regulatory_db", params: {query: str, limit: int}} |
| 3 | Parameter Validator | Validates against registered schema: query (string ≤ 500 chars ✓), limit (int 1–50 ✓) | PASS |
| 4 | Permission Gate | Task permission scope includes "read:regulatory_db"; tool requires "read:regulatory_db" | GRANTED |
| 5 | Budget Controller | Task budget: 8/10 calls remaining for "read" operations | PROCEED; budget decremented to 7/10 |
| 6 | Tool Executor | Invokes search_regulatory_db with validated params; 2s timeout | Raw result: [{doc_id: "CPS234-§3.4", content: "...500 chars..."}] |
| 7 | Result Sanitiser | Content length OK (800 chars < 2000 char limit); no injection patterns; schema valid | Sanitised result |
| 8 | Context Injector | Injects as Observation in agent scratchpad | Observation: 3 documents found: [...] |
| 9 | Audit Logger | Records: timestamp, tool, params, result_summary, budget_state, permission | Audit entry persisted |

Error Flow

| Error | Detection | Recovery |
| --- | --- | --- |
| Invalid parameter type (e.g. string passed for int field) | Parameter Validator | Inject: Observation: Tool call failed: parameter 'limit' must be integer, got string '10'. Correct and retry. |
| Tool call denied (permission not in scope) | Permission Gate | Inject: Observation: Tool 'write_record' is not permitted for this task. Available tools: [list of permitted tools] |
| Tool timeout | Executor timeout wrapper | Inject: Observation: Tool 'query_legacy_db' timed out after 30s. Consider an alternative approach or a simpler query. |
| Result exceeds max length | Result Sanitiser | Truncate; inject with truncation marker: Observation: [TRUNCATED at 2000 chars] First 2000 chars of result: [...] |
| Budget exhausted | Budget Controller | Inject: Observation: Tool call budget exhausted (10/10 calls used). Synthesise answer from available information. |
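
The gate ordering from the data flow and the error observations above can be tied together in a single function. This is a sketch under stated assumptions: `orchestrate` and all of its callable parameters are hypothetical, each gate is passed in as a plain function so that any implementation can be plugged in, and the executor is assumed to signal a timeout by raising the built-in `TimeoutError`.

```python
from typing import Any, Callable


def orchestrate(
    call: dict[str, Any],
    *,
    validate: Callable[[str, dict], list],
    is_permitted: Callable[[str], bool],
    consume_budget: Callable[[str], bool],
    execute: Callable[[str, dict], str],
    sanitise: Callable[[str], str],
    audit: Callable[[dict], None],
) -> str:
    """Run one tool call through the gates in order and return the observation."""
    tool, params = call["tool"], call["params"]
    errors = validate(tool, params)
    if errors:
        outcome = "invalid_params"
        observation = f"Tool call failed: {'; '.join(errors)}. Correct and retry."
    elif not is_permitted(tool):
        outcome = "denied"
        observation = f"Tool '{tool}' is not permitted for this task."
    elif not consume_budget(tool):
        outcome = "budget_exhausted"
        observation = (
            "Tool call budget exhausted. "
            "Synthesise answer from available information."
        )
    else:
        try:
            observation = sanitise(execute(tool, params))
            outcome = "executed"
        except TimeoutError:
            outcome = "timeout"
            observation = (
                f"Tool '{tool}' timed out. "
                "Consider an alternative approach or a simpler query."
            )
    # Every attempt is audited, including rejected ones.
    audit({"tool": tool, "outcome": outcome})
    return f"Observation: {observation}"
```

Because budget is consumed only after validation and permission checks pass, rejected calls do not count against the task budget in this sketch; whether they should is a policy decision the source does not settle.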

8. Security Considerations

Parameter Injection into Tool Calls

  • LLM may hallucinate parameter values designed to exploit tools (e.g., SQL injection attempts in a database query parameter)
  • Mitigation: Tool implementations must use parameterised queries and never string-interpolate LLM-provided values; the Parameter Validator enforces type safety but does not substitute for safe tool implementation
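
As an illustration of the mitigation, the sketch below uses Python's built-in `sqlite3` module with `?` placeholders, so the LLM-provided value is bound as data and never parsed as SQL. The table, rows, and function names are hypothetical and exist only for the demonstration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, content TEXT)")
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?)",
        [("D1", "information security capability"), ("D2", "incident notification")],
    )
    return conn


def search_docs(conn: sqlite3.Connection, query: str, limit: int) -> list[str]:
    """Search by substring; query and limit are bound, never interpolated."""
    rows = conn.execute(
        "SELECT doc_id FROM docs WHERE content LIKE ? ORDER BY doc_id LIMIT ?",
        (f"%{query}%", limit),
    ).fetchall()
    return [row[0] for row in rows]
```

A hallucinated value such as `'; DROP TABLE docs; --` is treated as a literal search string here and simply matches nothing. Note that `%` and `_` inside the query still act as LIKE wildcards, which is a matching concern rather than an injection risk.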

OWASP LLM Top 10

| OWASP LLM Risk | Tool Call Orchestration Applicability | Mitigation |
| --- | --- | --- |
| LLM01 Prompt Injection | Tool results injected into context may contain instructions | Result sanitisation; content delimiters around all observations |
| LLM07 Insecure Plugin Design | Tool parameters pass LLM output to external systems | Parameter validation; tool implementations use parameterised APIs; no string interpolation |
| LLM08 Excessive Agency | Write-capable tools can cause irreversible side effects | Write-tool budget limits; human approval gate before write calls; permission scoping |
| LLM04 Model DoS | Unlimited tool calls exhaust API quotas | Per-task, per-tool-type call budgets enforced before execution |
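
The "content delimiters around all observations" mitigation can be sketched as below. The marker format and the `wrap_observation` name are assumptions, not something the pattern prescribes. The idea is that each observation is wrapped with a fresh random boundary token, so text returned by a tool cannot predict and forge the closing marker.

```python
import secrets
from typing import Optional


def wrap_observation(tool_name: str, content: str, boundary: Optional[str] = None) -> str:
    """Wrap sanitised tool output in delimiters keyed by a boundary token."""
    # A fresh random boundary per observation is used unless one is supplied
    # (supplying one is intended for tests only).
    boundary = boundary or secrets.token_hex(8)
    return (
        f"[OBSERVATION {boundary} tool={tool_name}]\n"
        f"{content}\n"
        f"[END OBSERVATION {boundary}]"
    )
```

Delimiters only help if the agent's system prompt also states that anything between the markers is data to be analysed, never instructions to follow.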

9. Governance Considerations

Write Tool Governance

  • Tools that write, update, or delete data in production systems must have separate governance from read tools
  • Write tool calls should require explicit human approval for irreversible operations (EAAPL-HITL001)
  • Write tool calls must be individually logged with the full parameter set (for audit and rollback)

Governance Artefacts

| Artefact | Owner | Frequency | Purpose |
| --- | --- | --- | --- |
| Tool Permission Policy | Security + AI Governance | On change; quarterly review | Documents which tools are permitted per task type and user role |
| Tool Call Budget Policy | FinOps + AI Governance | Quarterly | Documents budget limits per task type and tool category |
| Tool Call Audit Archive | Compliance | Per call; retained per policy | Full record of every tool invocation for audit and investigation |
| Parameter Validation Schema Register | AI Platform | On tool registration or change | Version-controlled schemas for all registered tools |

10. Operational Considerations

SLOs

| SLO | Target | Window | Alert |
| --- | --- | --- | --- |
| Tool call success rate (validated + permitted + executed) | ≥ 97% | 1-hour rolling | < 93% triggers P2 |
| Parameter validation pass rate | ≥ 98% | 24-hour rolling | < 95% triggers P3; review LLM tool call quality |
| Tool execution p95 latency | ≤ tool-specific SLA (e.g., 5s for API, 30s for DB) | 1-hour rolling | Exceeds 2× SLA triggers P2 |
| Budget exhaustion rate | ≤ 3% of tasks | 24-hour rolling | > 8% triggers P3; review budget policy |

Monitoring

  • Validation failure by tool and parameter field: identifies systematic LLM misuse of specific tool APIs
  • Permission denial rate trending: increasing denials may indicate agents attempting out-of-scope operations
  • Tool latency distribution: performance degradation in upstream tool dependencies

11. Cost Considerations

| Cost Factor | Driver | Control |
| --- | --- | --- |
| LLM inference for tool call generation | Number of tool calls per task | Per-task budget; efficient tool design to reduce required calls |
| External API call costs | API pricing × call volume | Per-task, per-tool budget; caching identical calls |
| Compute for parameter validation | Negligible vs. LLM cost | Not a significant optimisation target |
| Write tool risk cost | Data corruption, API abuse, quota exhaustion | Budget limits; permission scoping; monitoring |
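
The "caching identical calls" control can be sketched as a small per-task cache keyed on the tool name and a canonical serialisation of the parameters. `ToolCallCache` is a hypothetical name, and the sketch assumes the cached tools are read-only and that parameters are JSON-serialisable; write tools should never be cached this way.

```python
import json
from typing import Any, Callable


class ToolCallCache:
    """Per-task cache that avoids repeating identical read-only tool calls."""

    def __init__(self, tool_fn: Callable[[str, dict], Any]) -> None:
        self._tool_fn = tool_fn
        self._cache: dict[tuple, Any] = {}
        self.upstream_calls = 0

    def call(self, tool_name: str, params: dict[str, Any]) -> Any:
        # sort_keys makes the key independent of parameter ordering.
        key = (tool_name, json.dumps(params, sort_keys=True))
        if key not in self._cache:
            self.upstream_calls += 1
            self._cache[key] = self._tool_fn(tool_name, params)
        return self._cache[key]
```

Whether a cache hit should still count against the tool call budget is a policy choice; counting it keeps agent loops bounded, while not counting it rewards the saved API spend.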

Budget Configuration Guidelines

| Task Type | Recommended Read Budget | Recommended Write Budget |
| --- | --- | --- |
| Information retrieval | 10–20 read calls | 0 write calls |
| Research and analysis | 15–30 read calls | 0–2 write calls (e.g., save result) |
| Automated processing | 5–15 read calls | 3–10 write calls (with approval gate) |
| Code generation + test | 10 read calls | 5 code execution calls |

12. Trade-Off Analysis

| Option | Safety | Flexibility | Latency Overhead | Complexity | Best For |
| --- | --- | --- | --- | --- | --- |
| A: Full orchestration layer (Recommended) | Very High | High | Low (< 10ms overhead) | Medium | Production agentic systems |
| B: Validation only (no budget/permission) | Medium | Very High | Very Low | Low | Development/prototyping |
| C: Permission + budget only (no validation) | Medium | High | Minimal | Low | Internal tools with trusted inputs |
| D: Direct tool invocation (no orchestration) | Low | Very High | None | None | Sandboxed research only |

Architectural Tensions

| Tension | Left Pole | Right Pole | Balance |
| --- | --- | --- | --- |
| Strict validation vs. Agent flexibility | Reject any deviation from schema | Accept anything; let tool handle errors | Strict type validation; permissive on optional fields |
| Budget tightness vs. Task completion | Very low budget (cost controlled) | High budget (high completion rate) | Set budget to p95 observed usage + 20% buffer |
| Result verbosity vs. Context efficiency | Full tool result in context | Summarised result only | Full result up to limit; summarise on truncation |
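
The "p95 observed usage + 20% buffer" rule can be made concrete as follows. The sketch assumes the nearest-rank definition of the 95th percentile and rounds the buffered value up; `recommend_budget` is a hypothetical helper, and integer arithmetic is used to avoid floating-point rounding surprises.

```python
def recommend_budget(observed_calls: list[int], buffer_pct: int = 20) -> int:
    """Return ceil(p95 * (1 + buffer_pct / 100)) using nearest-rank p95."""
    if not observed_calls:
        raise ValueError("at least one observation is required")
    ordered = sorted(observed_calls)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed in integers.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    # Ceiling of p95 * (100 + buffer_pct) / 100, again in integers.
    return (p95 * (100 + buffer_pct) + 99) // 100
```

For per-task call counts of 1 through 20, the nearest-rank p95 is 19 and the recommended budget is 23 calls.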

13. Failure Modes

| Failure Mode | Likelihood | Impact | Detection | Recovery |
| --- | --- | --- | --- | --- |
| Parameter hallucination (LLM generates wrong param values) | Medium | Medium: tool call fails; agent retries | Validation failure rate per tool | Validation error observation; agent self-corrects on retry |
| Tool result prompt injection | Low | High: agent hijacked | Result sanitisation catches patterns | Sanitise; delimit; anomaly alert if injection pattern detected |
| Budget exhausted too early (budget set too low) | Medium | Medium: task completes with partial information | Budget exhaustion rate monitoring | Tune budget policy based on observed p95 usage |
| Write tool called with stale data (race condition) | Low | High: data corruption | Idempotency key; optimistic locking at tool level | Idempotency key per write call (EAAPL idempotency guidance) |
| Timeout cascade (slow tool blocks entire task) | Low–Medium | Medium: task latency spike | Per-tool timeout monitoring | Per-tool timeout; error observation injected; agent uses alternative approach |

14. Regulatory Considerations

EU AI Act

  • Art. 9 (Risk Management): Tool call orchestration controls (parameter validation, permission gate, budget) are risk management measures for agentic AI systems interacting with live business systems.

APRA CPS 234

  • Every tool call that accesses or modifies information assets must be logged (tool call audit log) and access must be scoped to minimum necessary permissions (permission gate).

ISO 42001

  • §8.4: The tool permission policy and budget policy are operational controls that must be documented, version-controlled, and regularly reviewed.

Australian Context

  • For AFS-licensed entities, write tool calls that affect customer records must be individually auditable and the full parameter set must be retained for dispute resolution.
  • OAIC: Tool calls that access personal information must be scoped to minimum necessary; the permission gate implements this control.

15. Reference Implementations

AWS

| Component | Service |
| --- | --- |
| Parameter Extraction + Validation | Lambda function with Pydantic validation layer |
| Permission Gate | AWS IAM policy evaluation per tool ARN; custom RBAC via DynamoDB |
| Tool Execution | AWS Lambda per tool (invoked via SDK) |
| Budget Tracking | DynamoDB counter per task; atomic decrement |
| Result Sanitisation | Lambda function with content filtering |
| Audit Logging | CloudWatch Logs → Kinesis → S3 |

Azure

| Component | Service |
| --- | --- |
| Orchestration Layer | Azure Functions middleware chain |
| Permission Gate | Azure AD + custom RBAC claims |
| Tool Execution | Azure Functions per tool |
| Budget Tracking | Azure Cosmos DB counter |
| Audit Logging | Azure Monitor → Event Hubs → Blob Storage |

On-Premises

| Component | Technology |
| --- | --- |
| Full Orchestration | Custom Python orchestration layer; FastAPI middleware |
| Parameter Validation | Pydantic v2 with tool schema registry |
| Permission Gate | OPA (Open Policy Agent) with tool permission policies |
| Audit Log | PostgreSQL append-only table |

16. Related Patterns

| Pattern | ID | Relationship Type | Notes |
| --- | --- | --- | --- |
| Agent Tool Registry | EAAPL-AGT003 | Depends On | Registry provides tool schemas and permission definitions; orchestration enforces them at runtime |
| ReAct Agent Loop | EAAPL-WRK001 | Integrates With | Every Action phase in ReAct passes through the tool call orchestration layer |
| Human Escalation | EAAPL-HITL001 | Integrates With | Write tool calls may trigger human approval via escalation pattern |
| Workflow Tracing and Replay | EAAPL-WRK013 | Integrates With | Tool call audit log is a primary input to the workflow trace |
| Iterative Constraint Satisfaction | EAAPL-WRK015 | Complementary | Constraint checker can evaluate tool call plans before execution |

17. Maturity Assessment

Overall Maturity: Industry Standard

| Dimension | Score (1–5) | Evidence |
| --- | --- | --- |
| Research Foundation | 4 | Function calling widely studied; tool use safety emerging literature |
| Production Deployment | 5 | Tool calling deployed at scale in OpenAI, Anthropic, Google APIs and all major frameworks |
| Framework Support | 5 | Native function calling in all major LLM APIs; LangChain tools; LlamaIndex tools |
| Parameter Validation Tooling | 4 | Pydantic + Instructor widely adopted; OpenAI structured output GA |
| Permission + Budget Tooling | 3 | Custom implementations common; standardised tooling emerging |

18. Revision History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2025-06-13 | Architecture Board | Initial publication in Agentic Workflows category |