
An MCP server security review checklist

A structured checklist for reviewing an MCP server before connecting it to a production agent. Covers transport, authentication, tool manifest, context injection surface, third-party dependencies, and evidence requirements.

Drel Research, 30 November 2025, 10 min read

This checklist brings together the review requirements from across the MCP security review cluster into a single structured document. It is designed to be used directly in an AI security review for an MCP server deployment — whether the server is third-party or internally developed, and whether it uses stdio or HTTP/SSE transport.

Each section corresponds to one of the MCP attack surfaces or review domains. Each item specifies the gate at which it must be completed (before pilot, before production, or ongoing) and the evidence required to close the item. An item is not closed by an assertion — it is closed by the specified evidence.

How to use this checklist

The checklist is structured around five security domains:

Transport — how the host and server communicate.
Tool manifest — the tools the server exposes and their descriptions.
Authentication — how the server verifies callers.
Context injection — how tool results and resources are handled before reaching the model.
Supply chain — for third-party servers, the vetting of the server itself.

Each item has a gate: “Before pilot” items must be completed before any production-like testing with real data or real users. “Before production” items must be completed before the server is connected to a production agent. “Ongoing” items must be addressed on a recurring schedule after production deployment.

Gaps at the “Before pilot” gate should be treated as blockers. A gap at this gate means the control is missing from the design — it will not be addressed by operational measures later. “Before production” gaps may be acceptable during a controlled pilot if explicitly tracked and assigned to a remediation owner. “Ongoing” gaps are operational risks that require a remediation plan and a timeline.
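
The gate rules above can be sketched as a small decision function. This is a minimal illustration, not part of any tool referenced here: the `Gate` enum, the `Item` record and `pilot_decision` are hypothetical names, and treating "has a remediation owner" as the test for a tracked gap is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Gate(Enum):
    BEFORE_PILOT = "Before pilot"
    BEFORE_PRODUCTION = "Before production"
    ONGOING = "Ongoing"


@dataclass
class Item:
    name: str
    gate: Gate
    closed: bool  # True only when the specified evidence exists
    remediation_owner: Optional[str] = None


def pilot_decision(items: List[Item]) -> List[str]:
    """Return the reasons a controlled pilot cannot start (empty list = may start)."""
    reasons = []
    for item in items:
        if item.closed:
            continue
        if item.gate is Gate.BEFORE_PILOT:
            # Open "Before pilot" items are design gaps and block the pilot.
            reasons.append(f"blocker: {item.name}")
        elif item.remediation_owner is None:
            # Later-gate gaps are tolerable only if tracked and owned (assumption).
            reasons.append(f"untracked gap: {item.name}")
    return reasons
```

An empty result means no blockers and no untracked gaps; it says nothing about readiness for production, which needs the "Before production" items closed as well.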

MCP server security review checklist

Area	Check	Evidence required
Transport — TLS	TLS enforced on all HTTP/SSE endpoints. Protocol version TLS 1.2 minimum. Certificate validation enabled in client and not bypassed.	TLS configuration excerpt. Certificate validation test result showing invalid cert is rejected.
Transport — Auth	Authentication required before any connection is accepted. Unauthenticated requests return 401. Credentials stored in secrets manager, not code.	Unauthenticated request rejection test. Secrets management policy reference.
Tool manifest	All tool descriptions reviewed for imperative language, unusual length, and encoding anomalies. Approved manifest committed as a version-controlled baseline.	Signed manifest baseline. Description review notes for each tool. Manifest drift detection test result.
Tool scope	Every tool in the manifest is justified by the intended use case. Tools with no in-scope use are removed or disabled from the session.	Tool justification record. Configuration showing excess tools are excluded from session.
Context injection	System prompt explicitly frames tool results as untrusted data. MCP client validates tool results before injecting into context. Data source write-access controls documented.	System prompt review. Output validation test result. Data source access control documentation.
Authentication boundary	Per-user auth is implemented: end-user identity is forwarded to the MCP server as a verifiable token. Tool handlers enforce per-user access controls.	Auth architecture review showing user identity flow. Cross-user isolation test result.
Supply chain (third-party servers)	Server version pinned. Source review completed. Dependency vulnerability scan clean. Permission scope documented and justified. Re-review cadence registered.	Source review notes. Vulnerability scan output. Permission scope documentation. Version pin in deployment config.
Audit logging	Full tool-invocation audit log captures: tool name, parameters, invoking identity, end-user identity, result summary, and timestamp. 90-day retention minimum.	Sample audit log entries showing required fields. Log retention policy confirmation.
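
As an illustration of the audit logging row, the sketch below builds a log entry with the six fields listed in the table and checks a sample entry for missing fields. The field names, `build_audit_entry` and `missing_fields` are hypothetical; the 90-day retention requirement is a storage policy and is not something this code can verify.

```python
import json
from datetime import datetime, timezone

# Field names are assumptions that mirror the six items in the audit logging row.
REQUIRED_FIELDS = (
    "tool_name",
    "parameters",
    "invoking_identity",
    "end_user_identity",
    "result_summary",
    "timestamp",
)


def build_audit_entry(tool_name, parameters, invoking_identity,
                      end_user_identity, result_summary):
    """Assemble one tool-invocation audit record with a UTC timestamp."""
    return {
        "tool_name": tool_name,
        "parameters": parameters,
        "invoking_identity": invoking_identity,
        "end_user_identity": end_user_identity,
        "result_summary": result_summary,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def missing_fields(entry):
    """Return required fields that are absent or empty in a sampled log entry."""
    return [f for f in REQUIRED_FIELDS
            if f not in entry or entry[f] in (None, "")]


def to_log_line(entry):
    """Serialise the entry as a single JSON line."""
    return json.dumps(entry, sort_keys=True)
```

Running `missing_fields` over a sample of real log lines is one way to produce the "sample audit log entries showing required fields" evidence.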

Transport checklist

For the detailed reasoning behind each transport control, see Transport security for MCP servers.

Transport — HTTP/SSE

Review item	Gate	Evidence required
TLS 1.2+ enforced for all connections. Plaintext HTTP rejected.	Before pilot	TLS configuration excerpt showing protocol version and cipher suites.
Certificate validation enabled in the MCP client. Not bypassed for any deployment environment.	Before pilot	Test result: connection with invalid or expired cert is rejected.
Certificate issued by a trusted CA and within validity period.	Before pilot	Certificate details (issuer, expiry) and rotation schedule.
Server is not accessible from the public internet (or internet exposure is explicitly reviewed and approved).	Before pilot	Network access control documentation or network scan confirming scope.
Mutual TLS in place for high-sensitivity deployments.	Before production	mTLS configuration and client certificate provisioning record.
Transport-level connection events and auth failures logged.	Before production	Log sample showing connection and auth events.
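
For a Python-based MCP client, the first two items can be partly checked at the configuration level. The sketch below uses the standard library `ssl` module; `strict_client_context` and `context_findings` are hypothetical helper names. It inspects a client context only and does not replace the live test in which a connection to an invalid or expired certificate is shown to be rejected.

```python
import ssl
from typing import List


def strict_client_context() -> ssl.SSLContext:
    """Client context with certificate validation on and a TLS 1.2 floor."""
    # create_default_context() enables CERT_REQUIRED and hostname checking.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def context_findings(ctx: ssl.SSLContext) -> List[str]:
    """List the ways a client context falls short of the checklist items."""
    findings = []
    if ctx.verify_mode != ssl.CERT_REQUIRED:
        findings.append("certificate validation disabled")
    if not ctx.check_hostname:
        findings.append("hostname checking disabled")
    if ctx.minimum_version < ssl.TLSVersion.TLSv1_2:
        findings.append("minimum TLS version below 1.2")
    return findings
```

A context that has been weakened for a development environment (for example `verify_mode = ssl.CERT_NONE`) shows up as a finding, which is the bypass the second checklist item is meant to catch.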

Transport — stdio

Review item	Gate	Evidence required
MCP server subprocess runs under a least-privilege OS account, not the host application account.	Before pilot	Process user account and permission review.
Subprocess does not have access to credential files, SSH keys, or cloud configuration outside its stated scope.	Before pilot	Filesystem permission audit for the subprocess account.
In containerised deployments, subprocess isolation is documented.	Before pilot	Container configuration review showing subprocess isolation.
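
One way to gather evidence for the second item is to run a small probe as the subprocess account and record which sensitive paths it can read. This is a sketch under stated assumptions: `readable_sensitive_paths` is a hypothetical helper, the path list in the usage note is illustrative, and the result only reflects the account the probe actually runs under.

```python
import os
from pathlib import Path
from typing import Iterable, List


def readable_sensitive_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of paths that exist and are readable by this process."""
    found = []
    for raw in paths:
        p = Path(raw).expanduser()
        # os.access checks the permissions of the process running this probe,
        # so it must be executed as the MCP subprocess account.
        if p.exists() and os.access(p, os.R_OK):
            found.append(str(p))
    return found
```

For example, `readable_sensitive_paths(["~/.ssh", "~/.aws/credentials"])` returning an empty list, when run as the subprocess account, supports the claim that those locations are out of reach.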

Tool manifest checklist

For the detailed reasoning behind tool manifest controls — including tool poisoning mechanics — see Tool poisoning in MCP servers.

Tool manifest

Review item	Gate	Evidence required
Every tool in the manifest is justified for this deployment. No excess capabilities.	Before pilot	Manifest review document with one-line justification per tool.
Tool descriptions are accurate and minimal. No imperative language, no embedded instructions to the model.	Before pilot	Description review notes per tool. Reviewer sign-off.
Parameter schemas contain no instruction-like content in description fields.	Before pilot	Schema review notes. Any parameter description containing verbs reviewed.
Manifest is served from a version-controlled, authenticated source. Not dynamically generated at runtime from untrusted input.	Before pilot	Manifest source review showing version control and signing.
Approved manifest baseline is documented and signed. Deviations from the baseline are detectable.	Before pilot	Signed baseline artefact. Manifest diffing mechanism or test.
Destructive tools (write, delete, execute, send, modify) require an explicit approval step before the model can invoke them.	Before pilot	Approval boundary design review. Test confirming destructive tool is blocked without approval.
Behavioral test cases covering tool-poisoning scenarios have been executed and results documented.	Before production	Test case list and results.
Manifest re-review is scheduled on version change and on a periodic cadence.	Ongoing	Re-review schedule in the security review record.
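
The baseline and drift items can be sketched with a digest over a canonical form of the manifest, plus a rough heuristic pass over tool descriptions. Everything here is illustrative: the manifest shape (a `tools` list of objects with `name` and `description`) is an assumption, the function names are hypothetical, and the keyword pattern is a starting point for human review rather than a detector. A SHA-256 digest supports drift detection only; signing the baseline is a separate step.

```python
import hashlib
import json
import re

# Rough, assumed keyword list; a reviewer still reads every description.
SUSPICIOUS = re.compile(
    r"\b(ignore|always|never|must|do not|before using|instead)\b", re.IGNORECASE
)


def manifest_digest(manifest: dict) -> str:
    """SHA-256 over a canonical JSON encoding, stable across key order."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(baseline_digest: str, served_manifest: dict) -> bool:
    """True if the manifest served now differs from the approved baseline."""
    return manifest_digest(served_manifest) != baseline_digest


def flag_descriptions(manifest: dict, max_len: int = 300) -> dict:
    """Map tool name to the reasons its description needs a closer look."""
    flags = {}
    for tool in manifest.get("tools", []):
        desc = tool.get("description", "")
        reasons = []
        if SUSPICIOUS.search(desc):
            reasons.append("imperative language")
        if len(desc) > max_len:
            reasons.append("unusual length")
        if not desc.isascii():
            reasons.append("non-ASCII characters")
        if reasons:
            flags[tool["name"]] = reasons
    return flags
```

Comparing the digest on every session start, not only at review time, is what makes deviations from the baseline detectable in practice.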

Authentication checklist

For the detailed reasoning behind authentication controls — including the client-auth vs user-auth gap — see The MCP authentication boundary, reviewed.

Authentication and authorization

Review item	Gate	Evidence required
Client authentication required and enforced from connection establishment. Unauthenticated connections rejected before any manifest is served.	Before pilot	Auth enforcement test: unauthenticated connection returns rejection.
Per-user authentication is implemented for multi-user agent deployments. End-user identity is verified, not asserted.	Before pilot	Authentication architecture review showing user identity flow. Token validation test.
Per-user authorization is enforced at the tool handler layer. User A cannot access User B's data through the agent.	Before pilot	Cross-user access test: tool invocation with User A identity cannot retrieve User B data.
MCP server credentials are managed through a secrets manager. Not hardcoded in source code or deployment configuration.	Before pilot	Secrets management review. CI secret scanning output showing no hardcoded credentials.
MCP server runs under a least-privilege service identity. Not the host application identity or a broadly privileged account.	Before pilot	IAM policy review showing service identity and granted permissions.
Downstream systems are called with user-scoped credentials where per-user access controls exist.	Before production	Downstream call implementation review.
Credential rotation schedule is defined and implemented.	Ongoing	Rotation schedule and last rotation date.
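
The difference between verified and asserted identity, and the cross-user access test, can be illustrated with a toy tool handler. This is a sketch only: the HMAC token format, the in-memory `DOCUMENTS` store and all function names are hypothetical, and the per-process random key stands in for a key held in a secrets manager. A real deployment would more likely validate a standard token issued by its identity provider.

```python
import hashlib
import hmac
import secrets

# Stand-in for a key loaded from a secrets manager (assumption for the demo).
SIGNING_KEY = secrets.token_bytes(32)

# Hypothetical data store with an owner recorded per document.
DOCUMENTS = {
    "doc-1": {"owner": "user-a", "body": "A's notes"},
    "doc-2": {"owner": "user-b", "body": "B's notes"},
}


def sign(user_id: str) -> str:
    """Issue a token binding the user id to an HMAC tag."""
    tag = hmac.new(SIGNING_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{tag}"


def verify(token: str) -> str:
    """Return the user id if the tag is valid, otherwise raise."""
    user_id, _, tag = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    if not user_id or not hmac.compare_digest(tag.encode(), expected.encode()):
        raise PermissionError("invalid token")
    return user_id


def get_document(token: str, doc_id: str) -> str:
    """Tool handler: identity is verified, then access is checked per user."""
    user_id = verify(token)
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != user_id:
        raise PermissionError("not authorised")
    return doc["body"]
```

The cross-user test is then a call such as `get_document(sign("user-a"), "doc-2")`, which should raise `PermissionError`, together with a forged token that should fail verification before any data is touched.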

Context injection checklist

For the detailed reasoning behind context injection controls, see Prompt-context injection through MCP tools.

Context injection

Review item	Gate	Evidence required
System prompt explicitly frames tool results as untrusted data. Instructions for handling instruction-like content in tool results are present.	Before pilot	System prompt review confirming data/instruction framing.
Tool results are validated before being passed to the model. HTML/markdown stripping or instruction-pattern flagging is in place.	Before pilot	Output validation implementation review. Adversarial payload test result.
Data sources accessed by MCP tools are inventoried. Write access to each data source is documented.	Before pilot	Data source inventory with access control model for each source.
MCP tools access only the data sources required for their stated function. No broad filesystem or database access.	Before pilot	Access scope review. Access control test: out-of-scope resource returns rejection.
Prompt templates do not contain secrets, API keys, or internal policy text.	Before pilot	Template review. Secret scanning of template content.
Behavioral tests covering context injection scenarios have been executed and results documented.	Before production	Test case list and results.
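
The output validation item can be sketched as a function that strips markup from a tool result and flags instruction-like phrasing before the text reaches the model. The patterns below are a small assumed sample and `sanitise_tool_result` is a hypothetical name. Pattern matching of this kind is easy to evade, so it complements the system prompt framing and behavioral tests rather than replacing them.

```python
import html
import re
from typing import List, Tuple

TAG = re.compile(r"<[^>]+>")

# Assumed sample of instruction-like phrases; extend from your own test cases.
INSTRUCTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all |any )?(previous|prior) instructions",
        r"you are now",
        r"system prompt",
        r"do not tell the user",
    )
]


def sanitise_tool_result(text: str) -> Tuple[str, List[str]]:
    """Strip HTML and return (clean_text, matched_pattern_strings)."""
    # Unescape first so entity-encoded tags are stripped as well.
    unescaped = html.unescape(text)
    clean = TAG.sub("", unescaped)
    flags = [p.pattern for p in INSTRUCTION_PATTERNS if p.search(clean)]
    return clean, flags
```

Whether a flagged result is dropped, quarantined, or passed through with a warning is a design decision for the client; the checklist only requires that the behaviour is implemented and tested with adversarial payloads.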

Supply chain checklist

This section applies to third-party MCP servers. For internally developed servers, replace this section with the internal server checklist from Securing an internal MCP server exposed to agents. For the full vetting process, see Vetting a third-party MCP server before you connect it.

Supply chain — third-party MCP servers

Review item	Gate	Evidence required
Server maintainer is identified and accountable. Security disclosure process is documented.	Before pilot	Source review notes with maintainer identification.
Source code is reviewable (open source) or vendor has a documented security attestation.	Before pilot	Source review or vendor security attestation.
Server version is pinned in the deployment configuration. Floating version dependencies are not in use.	Before pilot	Deployment configuration showing pinned version.
Dependency vulnerability scan has been run against the pinned version. Findings documented and triaged.	Before pilot	Vulnerability scan output for the pinned version.
All server dependencies are sourced from standard package registries. No dependencies from personal or unverified sources.	Before pilot	Dependency provenance review.
Permissions granted to the server process are documented and scoped to the minimum required.	Before pilot	Permission scope documentation.
Version change review process is defined: who re-reviews on update, and at what cadence.	Before production	Review cadence documented in the security review record.
Dependency vulnerability scanning is scheduled to run periodically, not only at initial review.	Ongoing	Scheduled scan configuration or CI integration.
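
The version pinning item can be checked mechanically. The sketch below assumes dependencies are declared in a pip-style requirements file and treats anything other than an exact `==` pin as floating; `unpinned_requirements` is a hypothetical helper and the package names in the usage note are placeholders. Other packaging formats need their own check.

```python
import re
from typing import Iterable, List

# name, optional extras, then an exact version with no wildcard.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=*\s]+$")


def unpinned_requirements(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for line in lines:
        req = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not req:
            continue
        if not PINNED.match(req):
            unpinned.append(req)
    return unpinned
```

For example, `unpinned_requirements(["example-mcp-server==1.4.2", "requests>=2.0"])` returns `["requests>=2.0"]`. An empty result, attached alongside the deployment configuration, is one form of the "pinned version" evidence.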

Evidence requirements

A completed MCP server security review should produce a structured evidence pack that can be presented to a governance committee, a procurement review, or a post-incident inquiry. The evidence pack should include the artefacts listed under “Required artefacts” below.

Evidence is not an assertion that a control is in place. Evidence is an artefact that demonstrates it. A configuration file that enables a control, a test result that verifies it, or a log sample that shows it operating — these are evidence. A document that says “TLS is enabled” is an assertion. A TLS configuration excerpt and a certificate validation test result are evidence. The distinction matters when the evidence is examined under governance review or following an incident.

Required artefacts

Completed checklist — this checklist with each item marked pass, fail, or not applicable, with evidence reference for each passed item and remediation owner for each failed item.
Approved tool manifest baseline — a version-controlled, signed copy of the tool manifest as reviewed. Stored alongside the review record.
Test results — results of enforcement tests (unauthenticated rejection, certificate validation, cross-user access) and behavioral tests (tool-poisoning and context-injection scenarios).
Configuration excerpts — TLS configuration, IAM policy, secrets manager integration, network access control — as applicable.
Data source inventory — listing of data sources the MCP tools access, with access control model for each.
Residual risk record — any items that could not be fully remediated, with the residual risk description, the accepted risk owner, and the compensating controls in place.
Re-review schedule — the conditions that trigger a re-review (version change, configuration change, incident) and the periodic review cadence.
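
The rule for the completed checklist (evidence reference for each passed item, remediation owner for each failed item) can be validated before the pack is submitted. This is a minimal sketch: `ChecklistEntry`, `record_problems` and the status strings are hypothetical, and the evidence references are opaque identifiers that the code does not resolve.

```python
from dataclasses import dataclass
from typing import List, Optional

VALID_STATUSES = ("pass", "fail", "n/a")


@dataclass
class ChecklistEntry:
    item: str
    status: str  # "pass", "fail" or "n/a"
    evidence_ref: Optional[str] = None
    remediation_owner: Optional[str] = None


def record_problems(entries: List[ChecklistEntry]) -> List[str]:
    """List entries that do not meet the completed-checklist rule."""
    problems = []
    for e in entries:
        if e.status not in VALID_STATUSES:
            problems.append(f"{e.item}: unknown status {e.status!r}")
        elif e.status == "pass" and not e.evidence_ref:
            # A pass without an artefact is an assertion, not evidence.
            problems.append(f"{e.item}: passed without evidence reference")
        elif e.status == "fail" and not e.remediation_owner:
            problems.append(f"{e.item}: failed without remediation owner")
    return problems
```

A check like this catches the most common gap in an evidence pack, an item marked as passed with nothing attached to it.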

For deeper analysis of any individual section of this checklist, refer to the cluster article named at the start of that section.


A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.