Threat modeling an MCP server — the parts AppSec tools miss
MCP servers have four distinct attack surfaces: transport, tool surface, prompt context injection, and auth boundary. Traditional threat modeling tools model the first and miss the other three. Here is the full threat model with controls.
The Model Context Protocol is becoming the standard way to connect LLMs to external tools, data sources, and services. Anthropic published the spec in late 2024; by mid-2025 it had been adopted by every major AI platform. If your organisation is building or procuring agentic AI systems, there is a reasonable chance an MCP server is already in the architecture.
The security community has not caught up. Most threat models for AI systems treat the tool layer as a black box — “the agent calls tools” — without modelling the protocol that carries those calls, the server that handles them, or the surfaces that an attacker can reach through them. This piece fills that gap.
An MCP server is not a generic API. It has four distinct attack surfaces, each with its own threat class. A control at one surface does not protect against threats at another.
What MCP actually is
MCP is a JSON-RPC protocol that defines how a host application (the LLM + planner) communicates with external servers that expose tools, resources, and prompt templates. The three primitives are:
- Tools — functions the LLM can invoke. Each tool has a name, description, and JSON Schema for its parameters. The LLM reads the description to decide when to call the tool and what parameters to pass.
- Resources— data the server exposes for the LLM to read. Files, database rows, API responses. Resources are injected into the LLM's context window.
- Prompts — reusable prompt templates the server provides. The host can request a prompt template and inject it into the conversation.
Transport options are stdio (local process), HTTP+SSE (remote), and WebSocket. The protocol is stateful — the host maintains a session with each server.
The security implication is immediate: the LLM reads tool descriptions and resource content as part of its context. Anything the MCP server sends into that context can influence the LLM's behaviour. This is not a theoretical risk — it is the design of the protocol.
MCP architecture — four attack surfaces
Each numbered surface is an independent attack vector. A control at the transport layer does not protect against a tool-surface attack.
The four attack surfaces
An MCP server has four distinct attack surfaces. They are independent — a control at the transport layer does not protect against a tool-surface attack, and vice versa. Traditional AppSec threat modeling tools model the transport layer well and miss the other three almost entirely.
1. Transport layer
The transport layer carries JSON-RPC messages between the host and the server. For stdio transports (local process), the attack surface is the local machine. For HTTP+SSE and WebSocket transports (remote), the attack surface is the network.
The MCP spec does not mandate authentication at the transport layer. Many implementations — particularly local stdio servers — have no authentication at all. Any process on the host that can reach the server's socket can connect and invoke tools.
2. Tool surface
The tool surface is the most dangerous attack surface in an MCP server. It has two distinct threat vectors that are often conflated.
The first is tool manifest poisoning: an attacker modifies the tool manifest — the name, description, or parameter schema — to cause the LLM to invoke tools with unintended parameters or to misrepresent what a tool does. Because the LLM reads the tool description to decide how to use the tool, a poisoned description is a direct path to goal manipulation.
The second is indirect prompt injection via tool response: a tool returns a response that contains instructions the LLM interprets as new goals or tool invocations. This is the MCP-specific variant of the indirect prompt injection attack that OWASP Agentic A2 describes. The tool channel is a trusted input path by design — which makes it a high-value injection vector.
3. Prompt context injection
Resources and prompt templates are injected directly into the LLM's context window. This is the intended behaviour — it is how MCP provides the LLM with data to reason about. It is also a direct injection path.
A resource that contains adversarial instructions — a file with embedded “ignore previous instructions” text, a database row with a crafted string, a URL response with injected markdown — will be read by the LLM as part of its context. If the planner does not distinguish between data and instructions in the context, the injection succeeds.
Prompt templates are a separate risk: if a template contains secrets, API keys, or internal policy text, any client that can request the template can exfiltrate that content. The MCP spec does not define access controls on prompt templates.
4. Auth boundary
The auth boundary covers two distinct problems: how the host authenticates the server, and what identity the server uses to access downstream resources.
Server authentication is the simpler problem. Without it, a malicious process can register itself as an MCP server with the same name as a legitimate one. The host connects to the impersonator and all tool calls — including any credentials passed as parameters — are intercepted.
Server identity for downstream accessis the harder problem. An MCP server that runs under the host application's identity, or under a privileged service account, can access all resources that identity can reach. A compromised server — or a server that has been manipulated via tool manifest poisoning — can use that identity to exfiltrate data or execute actions far beyond the intended scope of the tool.
The full threat table
Eleven threats across the four surfaces, with severity ratings and framework mappings. Severity is assessed for a typical enterprise deployment where the MCP server has access to internal data and can invoke actions with real-world effects.
MCP server threats — 11 entries across 4 attack surfaces
| Surface | Severity | Threat | Framework mapping |
|---|---|---|---|
| Transport | Critical | Unauthenticated server connection MCP server accepts connections without verifying the client's identity. Any process on the host can connect and invoke tools. | OWASP A9 · MITRE ATLAS AML.T0012 |
| Transport | High | Message tampering (non-TLS stdio/SSE) stdio and plain HTTP transports carry JSON-RPC messages without integrity protection. An attacker with local access can modify tool call parameters in transit. | OWASP A2 · MITRE ATLAS AML.T0051 |
| Transport | Medium | Replay attack on tool invocations Tool call messages lack nonces or timestamps. A captured message can be replayed to re-execute a tool with the same parameters. | OWASP A8 · NIST AI RMF MANAGE 2.2 |
| Tool surface | Critical | Tool manifest poisoning Attacker modifies the tool manifest (name, description, parameters) to cause the LLM to invoke tools with unintended parameters or to misrepresent what a tool does. | OWASP Agentic A1 · MITRE ATLAS AML.T0051 |
| Tool surface | Critical | Parameter injection via tool response A tool returns a response containing instructions that the LLM interprets as new goals or tool invocations. Classic indirect prompt injection via the tool channel. | OWASP Agentic A2 · MITRE ATLAS AML.T0051 |
| Tool surface | High | Unbounded tool execution No rate limit or approval boundary on tool calls. A compromised planner can invoke destructive tools (delete, send, write) in a loop without human intervention. | OWASP Agentic A3 · A7 |
| Tool surface | High | Privilege escalation via tool chaining Tool A returns a token or credential that Tool B uses to access a higher-privilege resource. The MCP server does not enforce that the chain stays within the original permission grant. | OWASP Agentic A4 · MITRE ATLAS AML.T0012 |
| Prompt context | Critical | Malicious resource injection MCP resources (files, URLs, database rows) are injected into the LLM context without sanitisation. A resource containing adversarial instructions hijacks the planner. | OWASP Agentic A2 · MITRE ATLAS AML.T0051 |
| Prompt context | High | Prompt template exfiltration The MCP server exposes prompt templates that contain system instructions, internal policies, or API keys embedded in the template text. | OWASP Agentic A5 · MITRE ATLAS AML.T0037 |
| Auth boundary | Critical | Server impersonation A malicious process registers itself as an MCP server with the same name as a legitimate one. The host connects to the impersonator and all tool calls are intercepted. | OWASP Agentic A9 · MITRE ATLAS AML.T0012 |
| Auth boundary | High | Overprivileged server identity The MCP server runs with the same identity as the host application or a privileged service account. A compromised server can access all resources the host can access. | OWASP Agentic A4 · NIST AI RMF GOVERN 1.7 |
Controls that close each threat
Seventeen controls across the four surfaces. Each control is specific enough to assign to an owner and verify. The lifecycle gate indicates when the control must be in place — not when it should be planned.
Controls that close MCP threats — 17 controls across 4 surfaces
| Surface | Control | Gate | Evidence |
|---|---|---|---|
| Transport | Mutual TLS or signed message envelopes on all non-stdio transports | Before pilot | TLS config review + certificate chain verified |
| Transport | Client authentication required before any tool invocation is accepted | Before pilot | Auth enforcement test: unauthenticated call returns 401 |
| Transport | Message nonces or timestamps to prevent replay | Before production | Replay test: captured message rejected on second submission |
| Transport | Transport-level audit log (connection events, auth failures) | Before production | Log sample showing connection and auth events |
| Tool surface | Tool manifest served from a signed, version-controlled source — not dynamically generated at runtime | Before pilot | Manifest source review + signature verification test |
| Tool surface | Tool descriptions and parameter schemas validated against a canonical schema before being passed to the LLM | Before pilot | Schema validation test with adversarial manifest |
| Tool surface | Tool responses treated as untrusted data — not as instructions — before being passed to the planner | Before pilot | Prompt template review confirming data/instruction boundary |
| Tool surface | Destructive tools (write, delete, send, execute) require explicit human approval before execution | Before pilot | Approval boundary test: destructive call blocked without approval |
| Tool surface | Per-session tool call rate limit to prevent runaway loops | Before production | Rate limit config + enforcement test |
| Tool surface | Full tool call log: tool id, parameters, caller session, timestamp, response hash | Before production | Log sample with all required fields |
| Prompt context | Resources injected into LLM context are sanitised — HTML/markdown stripped, instruction-like patterns flagged | Before pilot | Sanitisation test with adversarial resource content |
| Prompt context | Prompt templates do not contain secrets, API keys, or internal policy text | Before pilot | Template review + secret-scanning CI check |
| Prompt context | Resource access scoped to the minimum required for the task — no broad filesystem or database access | Before pilot | Access control test: out-of-scope resource returns 403 |
| Auth boundary | MCP server runs under a dedicated least-privilege identity — not the host application identity | Before pilot | IAM policy review showing separate identity with minimal grants |
| Auth boundary | Server identity verified by the host before connection is established (certificate pinning or registry lookup) | Before pilot | Impersonation test: unregistered server rejected |
| Auth boundary | Server identity credentials rotated on a defined schedule | Ongoing | Rotation schedule + last rotation date |
| Auth boundary | Anomaly detection on server identity usage (unexpected connection sources, unusual invocation patterns) | Ongoing | Alert rule + test trigger |
What goes in the disposition
When an MCP server is part of an assessed system, the disposition memo needs to address each of the four surfaces explicitly. The controls table above maps directly to the required controls section of the disposition.
The re-assessment triggers for an MCP-connected system are specific:
- A new MCP server is added to the system.
- An existing server's tool manifest changes (new tools, changed descriptions, new parameters).
- The server's transport changes (e.g. stdio → HTTP).
- The server's downstream access is expanded (new database, new API, new filesystem path).
- The server identity changes or its permission grant is modified.
These triggers should be registered in the disposition at the time of the initial review. Without them, scope creep — the most common failure mode for agentic systems — goes undetected until an incident surfaces it.
The OWASP Agentic Top 10 control map covers the broader agentic threat surface. This piece covers the MCP-specific layer that sits underneath it.
Model an MCP server in Drel
Drel treats MCP servers as first-class components in the agentic graph — with their own threat surface, control requirements, and re-assessment triggers. The output is a disposition memo that covers all four attack surfaces.
Blog
Get new posts in your inbox
AI security review, OWASP Agentic Top 10, ISO 42001 evidence, and what AI Committees actually need. No cadence promises — we publish when there's something worth reading.
A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.