[EAAPL-MCP001] MCP Server Design
Category: MCP
Sub-category: Server Architecture
Version: 1.0
Maturity: Emerging
Tags: mcp, model-context-protocol, json-rpc, tool-exposure, resource-management, prompt-templates, server-lifecycle
Regulatory Relevance: EU AI Act Art. 9 (Risk Management), APRA CPS 234, ISO/IEC 42001 §8.4, NIST AI RMF (GOVERN 1.6, MANAGE 2.2), Privacy Act 1988 (Cth)
1. Executive Summary
The MCP Server Design Pattern describes how to design, implement, and operate a conformant Model Context Protocol server — the foundational component that exposes tools, resources, and prompt templates to AI models via the JSON-RPC 2.0-based MCP wire protocol. An MCP server is the authoritative boundary between an AI model's reasoning capabilities and the enterprise systems, data sources, and executable actions it can interact with. This pattern covers the three primitive capability types (tools, resources, prompts), the server lifecycle (initialisation, capability negotiation, request handling, shutdown), and the architectural decisions that determine whether a server is secure, observable, and operable in a production enterprise environment.
For CIO/CTO audiences: every organisation building AI workflows will eventually need to give AI models controlled access to internal systems — databases, APIs, document stores, calendars, code execution environments. MCP is the emerging open standard (developed by Anthropic, now broadly adopted) for this integration layer. Designing MCP servers correctly from the outset — with proper capability boundaries, input validation, audit logging, and secrets management — determines whether your AI integrations are a governed enterprise asset or an uncontrolled risk surface. This pattern is the foundation all other MCP patterns depend on.
2. Problem Statement
Business Problem
Organisations integrating AI models with internal systems repeatedly build bespoke, one-off integration layers that are incompatible across models and frameworks. Each integration is a custom engineering effort with no shared governance model, no reusable security controls, and no audit trail. When AI models are given direct access to production systems without a well-designed intermediary, the result is an ungoverned capability boundary that security and compliance teams cannot assess or control.
Technical Problem
Without a standardised server design, tool invocations lack schema validation, resource access lacks authorisation enforcement, and prompt injection can traverse directly to backend systems. The absence of a defined server lifecycle means capability negotiation is ad hoc, version mismatches cause silent failures, and there is no clean shutdown path — leaving dangling connections and incomplete audit records. JSON-RPC transport concerns (framing, error codes, notification handling) are reinvented per integration.
Symptoms of Absence
- Each AI tool integration is a custom implementation with no shared schema validation or error handling
- Backend systems receive raw, unvalidated inputs from AI model outputs
- No audit log exists of which model called which tool with what arguments and what result was returned
- Server capability set drifts over time with no versioning or negotiation mechanism
- Secrets (API keys, database credentials) are embedded in prompt context or passed as tool arguments
Cost of Inaction
- Security: Unvalidated tool inputs expose backend systems to prompt-injection-driven attacks; secrets leak through model context
- Compliance: Absence of tool invocation logs is a material gap under APRA CPS 234 and EU AI Act Art. 9 risk management requirements
- Operational: Bespoke integrations break silently on protocol version changes; no health monitoring detects degraded tool availability
- Engineering: Each new model or agent requires a full re-implementation of the integration layer
3. Context
When to Apply
- Building any server that exposes enterprise capabilities to an AI model via MCP
- Standardising existing bespoke tool integrations under a governed, auditable protocol layer
- The organisation requires a complete audit log of all AI-to-system interactions
- Multiple AI models or agents (potentially from different vendors) need access to the same enterprise capabilities
- Regulatory requirements demand demonstrable controls over what actions AI can take against internal systems
When NOT to Apply
- Simple single-model proof-of-concept with no production data access and no persistence beyond the session
- The integration is purely read-only retrieval of public data with no access controls or PII involved (a full MCP server adds overhead that may not be warranted)
- An existing enterprise integration platform already exposes the required capabilities via a model-native SDK that meets security and audit requirements
Prerequisites
- Clear enumeration of which capabilities (tools, resources, prompts) the server will expose
- Secrets management infrastructure (not environment variables in the server process for production deployments)
- Observability stack capable of receiving structured tool invocation logs
- Input validation library or framework for JSON Schema validation of tool arguments
- Network egress controls defining which backend systems the MCP server process is permitted to reach
Industry Applicability
| Industry | Requirement | Key Concern | Adoption Level |
|---|---|---|---|
| Financial Services | High — AI models accessing core banking, risk, and reporting systems need auditable, governed integration | Data integrity, audit trail completeness, secrets isolation | Early Adopter |
| Healthcare | High — AI accessing EHR, pharmacy, and imaging systems requires strict input validation and access scoping | Patient data protection, My Health Record Act, input sanitisation | Early Adopter |
| Government | Medium — APS AI deployments integrating with case management and citizen data systems | Sovereignty, data classification enforcement, APS8 access controls | Pilot |
| Technology/SaaS | High — AI-native products exposing internal APIs to LLM orchestration layers | Rate limiting, multi-tenant isolation, developer experience | Mainstream |
| Retail | Medium — AI accessing inventory, order, and CRM systems for customer-facing agents | PII handling, transactional integrity, inventory consistency | Pilot |
4. Architecture Overview
An MCP server is structured around three orthogonal capability primitives: tools (executable functions the model can invoke with arguments), resources (data and content the model can read, identified by URI), and prompts (parameterised prompt templates the model can retrieve for structured interactions). A well-designed server exposes a coherent, minimal set of each — never exposing more capability than the model needs for its current role.
The server lifecycle begins with transport establishment (stdio for local processes, HTTP+SSE for remote servers; WebSocket is not a standard MCP transport and would have to be implemented as a custom one), followed by an initialisation handshake in which the client sends a JSON-RPC initialize request carrying its protocol version and capabilities, and the server responds with its own capabilities declaration. This capability negotiation step is not optional: it is the mechanism by which protocol version mismatches are detected before any tool call is attempted. The server should reject initialize requests from clients whose protocol version it does not support.
Request handling follows the JSON-RPC 2.0 contract strictly: every request carries a unique id, tools are invoked via tools/call, resources via resources/read, and prompt templates via prompts/get. The server MUST validate all tool argument payloads against the tool's declared JSON Schema before passing them to backend handlers. The backend handler layer should be thin — input validation, authorisation check, backend call, response serialisation — with no business logic embedded in the MCP server itself. The server is a controlled gateway, not an application server.
Cross-cutting concerns — authentication, authorisation, audit logging, rate limiting, and secrets management — must be addressed at the server layer, not delegated to individual tool handlers. A per-request context object carrying the authenticated identity, a correlation ID, and a request timestamp should be threaded through every handler invocation. Every tool call result (success or error) must be written to an immutable audit log before the JSON-RPC response is returned to the client.
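The per-request context and the audit-before-response ordering described above can be sketched as follows. This is a minimal illustration, not the SDK's API: `RequestContext`, `call_with_audit`, and the in-memory `AUDIT_LOG` list are hypothetical names, and a real deployment would write to an immutable sink instead of a Python list.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class RequestContext:
    """Per-request context threaded through every handler invocation."""
    identity: str
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)


# Hypothetical stand-in for an append-only audit sink (e.g. WORM storage).
AUDIT_LOG: list[dict[str, Any]] = []


def call_with_audit(
    ctx: RequestContext,
    tool: str,
    arguments: dict[str, Any],
    handler: Callable[[RequestContext, dict[str, Any]], Any],
) -> dict[str, Any]:
    """Run a handler and write the audit record before returning the result."""
    started = time.monotonic()
    try:
        payload = {"isError": False, "content": handler(ctx, arguments)}
    except Exception as exc:  # errors are audited too, not only successes
        payload = {"isError": True, "content": str(exc)}
    AUDIT_LOG.append({
        "correlation_id": ctx.correlation_id,
        "identity": ctx.identity,
        "tool": tool,
        "status": "error" if payload["isError"] else "success",
        "latency_ms": round((time.monotonic() - started) * 1000, 3),
    })
    return payload  # only reached after the audit record exists
```

Because the audit append sits between the handler call and the `return`, a failure to write the record would raise before any result reaches the client, which is the ordering this pattern asks for.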
Wire Protocol Detail
The following JSON-RPC 2.0 exchanges show the wire format a conformant MCP server must implement. The message shapes follow the MCP 2024-11-05 specification.
Initialisation handshake — client sends first, server responds with its capabilities declaration:
// initialize request (client → server)
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{"roots":{"listChanged":true},"sampling":{}},"clientInfo":{"name":"claude-code","version":"1.0"}}}
// initialize response (server → client)
{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","capabilities":{"tools":{"listChanged":true},"resources":{"subscribe":true,"listChanged":true}},"serverInfo":{"name":"your-server","version":"1.0"}}}
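Under this pattern, the server MUST reject any initialize request whose protocolVersion it does not support; it must NOT silently fall back to a different version. (The MCP specification's own negotiation rule lets the server answer with another version it supports and leaves the client to disconnect if it cannot use that version; this pattern deliberately takes the stricter position.) The initialized notification (method "notifications/initialized", no id) is sent by the client after receiving the response and before issuing any capability requests.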
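A minimal sketch of the version check in the initialize handler, assuming a server that supports a single protocol version. `SUPPORTED_VERSIONS` and `handle_initialize` are illustrative names, and the error shape (code -32602 with supported and requested versions in `data`) is one reasonable choice, not the only conformant one.

```python
from typing import Any

SUPPORTED_VERSIONS = ("2024-11-05",)  # versions this hypothetical server accepts


def handle_initialize(request: dict[str, Any]) -> dict[str, Any]:
    """Answer an initialize request, rejecting unsupported protocol versions."""
    requested = request.get("params", {}).get("protocolVersion")
    if requested not in SUPPORTED_VERSIONS:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {
                "code": -32602,
                "message": "Unsupported protocol version",
                "data": {"supported": list(SUPPORTED_VERSIONS), "requested": requested},
            },
        }
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": {
            "protocolVersion": requested,
            "capabilities": {"tools": {"listChanged": True}},
            "serverInfo": {"name": "your-server", "version": "1.0"},
        },
    }
```

Returning the supported list in the error gives the client or its operator enough information to decide whether to retry with another version or raise an alert.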
Tool invocation — the tools/call method carries the tool name and its argument object exactly as declared in the tool's JSON Schema:
// tools/call request (client → server)
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"query_database","arguments":{"table":"customers","filter":"active=true","limit":100}}}
Error responses: standard JSON-RPC 2.0 error codes apply; application-level failures use codes from JSON-RPC's implementation-defined range (-32000 to -32099), with the failure detail carried in the error data field:
// schema validation failure (-32602 Invalid params)
{"jsonrpc":"2.0","id":2,"error":{"code":-32602,"message":"Invalid params","data":{"field":"limit","issue":"Must be ≤500 per APRA data minimisation policy"}}}
// tool execution failure (-32002, assigned by this pattern in the implementation-defined range)
{"jsonrpc":"2.0","id":2,"error":{"code":-32002,"message":"Tool execution failed","data":{"tool":"query_database","cause":"Backend connection timeout after 10000ms"}}}
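Error codes used by this pattern. Codes -32700 to -32603 are the standard JSON-RPC 2.0 codes; -32001 to -32003 are assigned by this pattern within the implementation-defined server-error range. The MCP specification itself uses -32002 for a missing resource and reports tool execution failures inside the tools/call result with isError: true, so confirm the assignments against the specification version you target: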
| Code | Meaning | When to use |
|---|---|---|
| -32700 | Parse error | Malformed JSON received |
| -32600 | Invalid Request | JSON-RPC structure violation (missing jsonrpc, method, or id) |
| -32601 | Method not found | method value is not a recognised MCP method |
| -32602 | Invalid params | Tool arguments fail JSON Schema validation |
| -32603 | Internal error | Unhandled server exception |
| -32001 | Resource not found | resources/read URI does not exist |
| -32002 | Tool execution failed | Handler ran but the backend operation failed |
| -32003 | Prompt not found | prompts/get template identifier does not exist |
A server MUST NOT return HTTP 200 with a JSON-RPC error object for transport-level failures — use the appropriate HTTP status code. A server MUST return HTTP 200 with a JSON-RPC error object for application-level failures (schema validation, tool execution errors, resource not found).
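The split between transport-level and application-level failures can be sketched as a small mapping. The `Failure` enum, the choice of 401 and 413 as transport-level examples, and `to_http_response` are assumptions made for illustration; only the JSON-RPC codes come from the JSON-RPC 2.0 standard.

```python
import json
from enum import Enum


class Failure(Enum):
    UNAUTHENTICATED = "unauthenticated"
    PAYLOAD_TOO_LARGE = "payload_too_large"
    INVALID_PARAMS = "invalid_params"
    METHOD_NOT_FOUND = "method_not_found"


# Assumed classification: these are rejected before the JSON-RPC layer runs.
TRANSPORT_STATUS = {Failure.UNAUTHENTICATED: 401, Failure.PAYLOAD_TOO_LARGE: 413}
# Standard JSON-RPC 2.0 codes for failures raised after the request was parsed.
APPLICATION_CODE = {Failure.INVALID_PARAMS: -32602, Failure.METHOD_NOT_FOUND: -32601}


def to_http_response(failure: Failure, request_id: int, message: str) -> tuple[int, str]:
    """Return (HTTP status, body) for a failure, following the rule above."""
    if failure in TRANSPORT_STATUS:
        # Transport-level: real HTTP status, no JSON-RPC error object.
        return TRANSPORT_STATUS[failure], ""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": APPLICATION_CODE[failure], "message": message},
    }
    # Application-level: HTTP 200 carrying the JSON-RPC error object.
    return 200, json.dumps(body)
```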
5. Architecture Diagram
6. Components
| Component | Responsibility | Technology Examples |
|---|---|---|
| Transport Handler | Frames and deframes JSON-RPC 2.0 messages; manages connection lifecycle for stdio, HTTP+SSE, and WebSocket transports | Node.js @modelcontextprotocol/sdk, Python mcp library, custom stdio framer |
| Capability Negotiator | Handles initialize / initialized handshake; validates client protocol version; builds and returns server capabilities declaration | MCP SDK built-in; custom version enforcement layer |
| Request Router | Dispatches validated JSON-RPC requests to the appropriate capability handler based on method name | Express router, FastAPI route handlers, SDK dispatcher |
| Tool Handler Registry | Holds the set of registered tool definitions (JSON Schema + handler function); validates arguments before handler invocation | MCP SDK server.tool() registration, Zod/Pydantic schema validation |
| Resource Handler Registry | Manages URI-addressable resource providers; handles resources/list and resources/read including pagination | MCP SDK server.resource(), URI template matching |
| Auth + Authorisation Middleware | Verifies caller identity (OAuth 2.1 token, mTLS, API key); enforces per-tool and per-resource access policy | OAuth 2.1 bearer token introspection, OPA policy engine, custom RBAC |
| Audit Logger | Writes an immutable, structured record of every tool invocation and resource read before response is returned | AWS CloudTrail, Azure Monitor, Elasticsearch, Splunk |
7. Implementation Steps
Step 1: Define the Capability Surface
Before writing any code, enumerate the exact set of tools, resources, and prompts the server will expose. For each tool, write the complete JSON Schema for its arguments and define the authorisation tier required to invoke it. For each resource, define the URI scheme and the data classification of the content it may return. This upfront definition becomes the server's governance artefact — it is reviewed by security before implementation begins, not after.
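One way to capture the capability surface as a reviewable artefact is a typed register entry per tool. The sketch below is an assumption about shape, not a prescribed format: `ToolDefinition`, the `analyst-read` tier, the classification label, and the owner are hypothetical, while `name`, `description`, and `inputSchema` are the fields a tools/list entry carries.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolDefinition:
    """One row of the capability register, reviewed before implementation."""
    name: str
    description: str
    input_schema: dict[str, Any]   # complete JSON Schema for the arguments
    access_tier: str               # hypothetical authorisation tier label
    data_classification: str       # hypothetical classification label
    owner: str


QUERY_DATABASE = ToolDefinition(
    name="query_database",
    description="Read rows from an allow-listed table.",
    input_schema={
        "type": "object",
        "properties": {
            "table": {"type": "string", "enum": ["customers"]},
            "filter": {"type": "string", "maxLength": 200},
            "limit": {"type": "integer", "minimum": 1, "maximum": 500},
        },
        "required": ["table"],
        "additionalProperties": False,
    },
    access_tier="analyst-read",
    data_classification="confidential-pii",
    owner="data-platform-team",
)


def to_tools_list_entry(tool: ToolDefinition) -> dict[str, Any]:
    """Project a register entry to the fields exposed via tools/list."""
    return {
        "name": tool.name,
        "description": tool.description,
        "inputSchema": tool.input_schema,
    }
```

Keeping governance fields (tier, classification, owner) on the same object as the schema means the security review and the runtime registration read from one source, while only the protocol-facing fields are exposed to clients.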
Step 2: Implement Transport and Lifecycle
Bootstrap the server using the official MCP SDK for your language (TypeScript @modelcontextprotocol/sdk or Python mcp). Implement the transport layer first: stdio for local process deployment, and an HTTP transport for remote deployment (HTTP+SSE under the 2024-11-05 specification; Streamable HTTP under the 2025-03-26 revision and later). Implement the initialize handler to validate the client's protocol version and return the server's capabilities declaration. Implement a clean shutdown handler that closes backend connections and flushes the audit log buffer before the process exits.
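The clean shutdown path can be sketched independently of any SDK. `Lifecycle`, `register_backend`, and the `flush_audit` callback are hypothetical names; the sketch assumes each backend exposes a no-argument close function.

```python
from typing import Callable


class Lifecycle:
    """Tracks backend connections and the audit buffer for a clean shutdown."""

    def __init__(self, flush_audit: Callable[[], None]) -> None:
        self._flush_audit = flush_audit
        self._closers: list[Callable[[], None]] = []
        self._stopped = False

    def register_backend(self, close: Callable[[], None]) -> None:
        self._closers.append(close)

    def shutdown(self, *_signal_args: object) -> list[Exception]:
        """Close backends in reverse order, then flush the audit buffer once."""
        errors: list[Exception] = []
        if self._stopped:  # idempotent: a second signal must not double-flush
            return errors
        self._stopped = True
        for close in reversed(self._closers):
            try:
                close()
            except Exception as exc:  # one failing backend must not block the flush
                errors.append(exc)
        self._flush_audit()
        return errors
```

The variadic signature lets the same method be registered as a signal handler, for example with `signal.signal(signal.SIGTERM, lifecycle.shutdown)`, so that an orchestrator's termination signal follows the same path as an orderly exit.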
Step 3: Register Capabilities with Schema Validation
Register each tool with its complete JSON Schema definition. The SDK will surface these schemas to the model during capability discovery. For every tools/call invocation, validate the incoming arguments against the registered schema before passing them to the handler — reject with a structured JSON-RPC error (-32602 Invalid params) if validation fails. Never pass raw model output directly to a backend system without schema validation.
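The validate-then-dispatch step can be illustrated with a deliberately small validator. This is a stand-in covering only `required`, `type`, `maximum`, and `additionalProperties`; a production server would use a full JSON Schema library or Zod/Pydantic. `validate`, `handle_tools_call`, and `SCHEMA` are illustrative names.

```python
from typing import Any, Callable

_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}

SCHEMA = {
    "type": "object",
    "properties": {
        "table": {"type": "string"},
        "limit": {"type": "integer", "maximum": 500},
    },
    "required": ["table"],
    "additionalProperties": False,
}


def validate(arguments: dict[str, Any], schema: dict[str, Any]) -> list[dict[str, str]]:
    """Return a list of issues; an empty list means the arguments are valid."""
    issues = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in arguments:
            issues.append({"field": name, "issue": "Missing required field"})
    for name, value in arguments.items():
        rule = props.get(name)
        if rule is None:
            if schema.get("additionalProperties") is False:
                issues.append({"field": name, "issue": "Unexpected field"})
            continue
        expected = _TYPES[rule["type"]]
        # bool is a subclass of int in Python, so reject it for integer fields
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            issues.append({"field": name, "issue": f"Expected {rule['type']}"})
        elif "maximum" in rule and value > rule["maximum"]:
            issues.append({"field": name, "issue": f"Must be <= {rule['maximum']}"})
    return issues


def handle_tools_call(
    request: dict[str, Any],
    schema: dict[str, Any],
    handler: Callable[[dict[str, Any]], Any],
) -> dict[str, Any]:
    """Validate arguments first; the handler never sees invalid input."""
    arguments = request["params"].get("arguments", {})
    issues = validate(arguments, schema)
    if issues:
        error = {"code": -32602, "message": "Invalid params", "data": {"issues": issues}}
        return {"jsonrpc": "2.0", "id": request["id"], "error": error}
    return {"jsonrpc": "2.0", "id": request["id"], "result": handler(arguments)}
```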
Step 4: Implement Cross-Cutting Middleware
Layer authentication, authorisation, and audit logging as middleware that wraps every handler invocation. Authentication should validate the caller's identity before any routing occurs. Authorisation should check the authenticated identity against the tool's required access tier. Audit logging should record the caller identity, tool name, sanitised arguments (redacting PII and secrets), the result status, and latency — written to the audit log before the response is dispatched.
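The authorisation check and the argument sanitisation step can be sketched as two small functions that feed the audit record. The tier names, the `SENSITIVE_KEYS` set, and the record fields are hypothetical; key-based redaction is a simplification, since PII can also appear inside free-text values.

```python
from typing import Any

TIER_RANK = {"read": 1, "write": 2, "admin": 3}          # hypothetical access tiers
SENSITIVE_KEYS = {"password", "api_key", "email", "tfn"}  # hypothetical redaction list


def is_authorised(caller_tier: str, required_tier: str) -> bool:
    """A caller may invoke a tool whose required tier is at or below its own."""
    return TIER_RANK[caller_tier] >= TIER_RANK[required_tier]


def redact(arguments: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the arguments with sensitive values masked."""
    clean: dict[str, Any] = {}
    for key, value in arguments.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)  # recurse into nested argument objects
        else:
            clean[key] = value
    return clean


def audit_record(
    identity: str, tool: str, arguments: dict[str, Any], status: str, latency_ms: float
) -> dict[str, Any]:
    """Build the structured record written before the response is dispatched."""
    return {
        "identity": identity,
        "tool": tool,
        "arguments": redact(arguments),
        "status": status,
        "latency_ms": latency_ms,
    }
```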
AU Compliance Note — APRA CPS 234 Attachment C & APS 330: APRA CPS 234 Attachment C requires entities to test the effectiveness of information security controls, including "the completeness and accuracy of records of access to information assets." MCP tool invocation audit logs directly satisfy this requirement — each log entry is the access record for the information asset the tool exposes. Audit logs must capture the authenticated identity, timestamp, tool name, and outcome for every invocation, with no gaps. Under APS 330 (Public Disclosure), APRA-regulated entities must retain records sufficient to reconstruct the basis of AI-assisted decisions; the MCP audit log is that record for tool-mediated AI actions. Minimum retention period: 7 years. Use WORM storage (S3 Object Lock, Azure Immutable Blob Storage) and log-signing to satisfy the "complete and accurate" integrity requirement. Any gap in the audit sequence — even a single missed tool invocation — is a material control deficiency reportable under CPS 234 §36.
8. Security Considerations
OWASP LLM Top 10 Mapping (2025 edition IDs)
| OWASP ID | Threat | Mitigation |
|---|---|---|
| LLM01 | Prompt Injection: adversarial content in resource reads or tool arguments alters server behaviour | Validate all tool arguments against JSON Schema; treat all model-originated strings as untrusted input |
| LLM05 | Improper Output Handling: tool results containing executable content passed back to model without sanitisation | Sanitise tool return values; never interpolate raw tool output into system prompts |
| LLM06 | Excessive Agency: server exposes more tools or broader permissions than the model's current task requires | Minimal capability exposure per server role; per-session capability scoping |
| LLM09 | Misinformation (overreliance): model treats tool results as authoritative without error checking | Return structured error types; implement result confidence metadata where applicable |
| LLM10 | Unbounded Consumption: server acts as amplifier for resource exhaustion | Per-identity rate limiting; token budget enforcement on resource reads |
Additional Controls
- Never store secrets (API keys, database credentials) in environment variables accessible to the server process in production — use a secrets manager (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault)
- Enforce network egress filtering: the MCP server process should only be able to reach the specific backend hostnames it needs, not the entire internal network
- Implement request size limits on tool argument payloads to prevent memory exhaustion attacks
- Rotate tool invocation audit log signing keys quarterly; archive signed logs to WORM storage for APRA CPS 234 compliance
- Apply content-type validation to resource reads — do not attempt to deserialise unexpected MIME types
9. Governance Artefacts
- MCP Server Capability Register — lists every tool, resource, and prompt template exposed, with owner, access tier, data classification, and last review date
- Tool Argument Schema Library — versioned JSON Schema definitions for all tool inputs, stored in version control
- Tool Invocation Audit Log — immutable structured log of every tool call: timestamp, caller identity, tool name, arguments (sanitised), result status, latency
- Server Secrets Inventory — register of all secrets the server requires, their rotation schedule, and their vault location
- Capability Change Control Record — approval record for any addition, modification, or removal of a tool or resource
10. SLOs
| SLO | Target | Measurement |
|---|---|---|
| Tool invocation latency (p99) | < 2 000 ms end-to-end | Distributed trace span from tools/call receipt to response dispatch |
| Capability negotiation success rate | > 99.9% | Count of initialize exchanges that complete without protocol error |
| Audit log write success rate | 100% — no tool response returned without audit record | Log pipeline completeness check; alert on any gap |
| Schema validation rejection rate | < 0.5% of tool calls | Count of -32602 errors; high rate indicates model prompt engineering issue |
| Server availability | > 99.5% | Health endpoint probe from client-side synthetic monitor |
11. Cost Model
| Cost Driver | Estimate | Notes |
|---|---|---|
| MCP server compute (EC2 t3.small, ap-southeast-2) | ~AU$22/month | Single t3.small in Sydney region; suitable for low-to-medium invocation volumes. Scale to t3.medium (~AU$44/month) for > 500k calls/month |
| Serverless MCP (API Gateway + Lambda, ap-southeast-2) | ~AU$3.50 per million invocations | Includes API Gateway HTTP API ($1.29/M), Lambda requests ($0.20/M), and Lambda compute (about $2.08/M at 256 MB × 500 ms avg); optimal for bursty tool call patterns |
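| CloudWatch log retention (7yr APRA CPS 234 / APS 330 compliant) | ~AU$0.033/GB/month | S3 Glacier Deep Archive via log export; 1M tool calls/day at 1 KB/record ≈ 30 GB/month of new logs, which adds ≈ AU$1/month per server for each month of retained data (≈ AU$83/month once a full 7 years is held at this rate) |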
| Secrets manager | AU$0.55/secret/month (AWS ap-southeast-2) + AU$0.07 per 10,000 API calls | Typically < AU$15/month per server |
| Schema validation library | Open source (Zod, Pydantic) | No licence cost; engineering time for schema authoring ≈ 2–4 hours per tool |
| Engineering (initial build) | 3–6 weeks for a production-grade server with auth, audit, and monitoring | Reuse drops this to 1–2 weeks for subsequent servers using shared libraries |
| Total TCO at 10M tool calls/month | ~AU$380–650/month | Lower bound: serverless Lambda + S3 audit + Secrets Manager. Upper bound: t3.medium VM + ElastiCache for rate limiting + CloudWatch Insights queries. Excludes upstream AI model costs |
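The audit log storage figure in the table can be checked with a short calculation. It uses the table's own inputs (1M calls/day, 1 KB/record, AU$0.033/GB/month, 7-year retention) and assumes decimal units and a 30-day month; the function name is illustrative.

```python
CALLS_PER_DAY = 1_000_000
RECORD_KB = 1
PRICE_PER_GB_MONTH = 0.033      # AU$, taken from the cost table
RETENTION_MONTHS = 7 * 12

# Decimal units: 1 GB = 1,000,000 KB; 30-day month assumed.
GB_PER_MONTH = CALLS_PER_DAY * RECORD_KB * 30 / 1_000_000


def storage_cost(months_elapsed: int) -> float:
    """Monthly storage bill (AU$) after the given number of months of logging."""
    retained_months = min(months_elapsed, RETENTION_MONTHS)  # older logs expire
    return round(GB_PER_MONTH * retained_months * PRICE_PER_GB_MONTH, 2)


print(GB_PER_MONTH)        # 30.0 GB of new audit records per month
print(storage_cost(1))     # 0.99  -> roughly AU$1 in the first month
print(storage_cost(84))    # 83.16 -> steady state once 7 years are retained
```

The bill grows linearly until the retention window is full, so budgeting on the first-month figure alone understates the steady-state cost by a factor of 84.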
12. Trade-off Analysis
| Dimension | Benefit | Trade-off |
|---|---|---|
| Strict JSON Schema validation | Prevents malformed inputs reaching backend systems | Increases development time per tool; overly strict schemas break on minor model output variation |
| Remote HTTP+SSE transport vs stdio | Supports multi-client, scalable deployment | Higher latency and infrastructure complexity vs local stdio |
| Centralised auth middleware | Consistent policy enforcement across all tools | Single point of failure; auth service downtime blocks all tool calls |
| Immutable audit log before response | Guarantees no tool call is unlogged even on crash | Adds latency to every tool call (typically 5–20 ms, because the response waits for the log write to be acknowledged) |
| Minimal capability exposure per server | Reduces blast radius if the server is compromised | Requires more servers (and more operational overhead) to cover the full capability surface |
13. Failure Modes
| Failure | Trigger | Recovery |
|---|---|---|
| Capability negotiation failure | Client and server protocol versions incompatible | Server returns structured initialize error with supported version range; client falls back or alerts operator |
| Schema validation storm | Model produces consistently invalid tool arguments | Server returns -32602 with schema diff; alert fires on rejection rate threshold; prompt engineering review triggered |
| Backend system timeout | Downstream API exceeds tool handler timeout | Return structured isError: true tool result with timeout message; do not leave backend connection hanging; circuit breaker opens after N consecutive timeouts |
| Audit log write failure | Log pipeline unavailable | Server returns HTTP 503 and refuses to complete the tool call — never sacrifice the audit record for a tool result |
| Secret rotation failure | Secrets manager returns stale or expired credential | Tool handler catches auth error from backend, triggers secret refresh, retries once; alerts on second failure |
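The backend timeout row can be sketched as a consecutive-failure circuit breaker that returns an isError tool result. `CircuitBreaker` and `call_backend` are hypothetical names, the threshold of 3 is arbitrary, and the half-open or timed reset behaviour a production breaker needs is omitted for brevity.

```python
from typing import Any, Callable


class CircuitBreaker:
    """Opens after `threshold` consecutive timeouts; any success resets it."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_timeouts = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_timeouts >= self.threshold

    def record_success(self) -> None:
        self.consecutive_timeouts = 0

    def record_timeout(self) -> None:
        self.consecutive_timeouts += 1


def _tool_result(text: str, is_error: bool) -> dict[str, Any]:
    return {"isError": is_error, "content": [{"type": "text", "text": text}]}


def call_backend(breaker: CircuitBreaker, backend: Callable[[], Any]) -> dict[str, Any]:
    """Call the backend unless the breaker is open; map timeouts to isError."""
    if breaker.is_open:
        # Fail fast without touching the struggling backend.
        return _tool_result("Backend unavailable (circuit open)", True)
    try:
        value = backend()
    except TimeoutError:
        breaker.record_timeout()
        return _tool_result("Backend timed out", True)
    breaker.record_success()
    return _tool_result(str(value), False)
```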
14. Regulatory Mapping
| Regulation | Clause | Requirement | How Pattern Addresses It |
|---|---|---|---|
| APRA CPS 234 | §15 | "An APRA-regulated entity must maintain an information security capability commensurate with the size and extent of threats to its information assets" | MCP server's JSON Schema input validation and per-tool access controls constitute the information security capability for AI-to-system interfaces; the capability register documents scope and criticality |
| APRA CPS 234 | §35 | Notification of material information security incidents within 72 hours | MCP audit log alerting (alert on audit gap, alert on elevated error rates) provides the detection mechanism; the audit log provides the evidentiary basis for the incident notification to APRA |
| Privacy Act 1988 (Cth) | Schedule 1 APP 11.1 | "An APP entity must take reasonable steps to protect personal information it holds from misuse, interference and loss, and from unauthorised access, modification or disclosure" | PII redaction in tool argument audit records; schema-validated resource access scoped to authorised identities; WORM audit log prevents post-hoc modification |
| EU AI Act | Art. 10(2) | Data sets used by a high-risk AI system must be subject to data governance and management practices appropriate for the system's intended purpose | Resource access controls enforced at the capability handler layer; data classification in the capability register governs which resources AI models may read; audit log provides the governance record |
| ISO/IEC 42001 | §8.4 | AI system documentation must cover data inputs and system interfaces | Capability register and schema library satisfy the interface documentation requirement |
| NIST AI RMF | GOVERN 1.6 | Policies and procedures for AI risk management must be documented and enforced | Capability Change Control Record and Audit Log satisfy the governance documentation and enforcement requirements |
15. Reference Implementations
AWS
Deploy the MCP server as an AWS Lambda function behind API Gateway (HTTP API) for the Streamable HTTP transport. Use Lambda Powertools for structured audit logging to CloudWatch Logs, with log group retention set to match APRA CPS 234 retention requirements (7 years). Retrieve secrets via the AWS Parameters and Secrets Lambda Extension to avoid cold-start latency on Secrets Manager calls. Tool handler timeouts map to Lambda function timeouts; set them conservatively (10–30 seconds) and configure reserved concurrency to limit blast radius.
Azure
Deploy as an Azure Container App with the Streamable HTTP transport. Use Azure Key Vault references in Container App secrets to avoid storing credentials in environment variables. Route audit logs to Azure Monitor Log Analytics with a Data Collection Rule enforcing immutability. Use Azure API Management as the authentication front door, validating OAuth 2.1 tokens via the validate-jwt policy before requests reach the MCP server container.
On-Premises / Self-Hosted
Package the MCP server as a Docker container and deploy via Kubernetes. Use HashiCorp Vault Agent Injector to mount secrets as in-memory files (never environment variables). Route audit logs to a Fluentd sidecar that forwards to an Elasticsearch cluster with index lifecycle management enforcing append-only writes. Expose the server via an Envoy sidecar that handles mTLS termination and enforces per-client rate limits before traffic reaches the MCP server process.
16. Related Patterns
- EAAPL-MCP002: MCP Gateway — when multiple MCP servers need a unified authentication, routing, and policy enforcement front door
- EAAPL-MCP003: Multi-Server Orchestration — composing multiple MCP servers for complex AI workflows
- EAAPL-MCP004: MCP Authentication & Authorisation — detailed patterns for securing MCP server access
- EAAPL-AGT003: Agent Tool Registry — enterprise-wide tool catalogue that MCP servers can register with
- EAAPL-AGT004: Agent Sandboxing — isolating MCP server execution environments for high-risk tool handlers
17. Maturity Assessment
| Dimension | Level | Notes |
|---|---|---|
| Tooling | 3 | Official TypeScript and Python SDKs are production-quality; other language SDKs are community-maintained and vary in completeness |
| Community Adoption | 3 | Rapid adoption across major model providers and agent frameworks (Claude, Cursor, Zed, Continue); enterprise-grade tooling still maturing |
| AU Enterprise Readiness | 2 | Protocol is sound; enterprise-grade reference architectures, compliance mapping, and managed hosting options are still emerging in the AU market |
| Regulatory Clarity | 2 | No specific regulatory guidance names MCP; mapping to existing frameworks (CPS 234, Privacy Act) is practitioner-led, not regulator-confirmed |
18. Revision History
| Version | Date | Change |
|---|---|---|
| 1.0 | 2026-06-14 | Initial release |