Agentic AI security review — for agents that take real actions.
Agents that call APIs, write to databases, send emails, and spawn sub-agents operate at a blast radius that most security reviews are not designed for. Here is what reviewing one actually requires.
What makes agentic AI different
A conventional LLM-based system takes input, produces output, and stops. The output might be displayed to a user, logged to a database, or passed to another system — but the model itself does not act. An agentic AI system is different in a fundamental way: it takes actions. It calls tools. It sends API requests. It writes to databases, sends emails, creates calendar events, executes code, spawns sub-agents, and interacts with external services. The model's output is not the end of the process — it is the instruction that drives the next step.
This distinction changes the security model entirely. For a non-agentic LLM, the harm surface is bounded by what the model can say. For an agentic system, the harm surface is bounded by what the agent can do — and “what the agent can do” is determined by its tool set, its permissions, its delegation scope, and the safeguards (if any) that gate its actions.
The security review frameworks designed for conventional software (STRIDE, PASTA, DREAD) model or score threats against components and data flows. They do not model autonomy, goal-seeking behavior, or tool-use side effects. An agent that sends a calendar invite on behalf of a user does not appear in a STRIDE model as a threat. But if that agent's tool access is not scoped (if it can invite any recipient, at any time, with any content), the scope of what it can do on behalf of a compromised or manipulated session is substantial.
The four agentic attack surfaces
Agentic systems introduce four attack surfaces that do not appear in standard LLM threat models. A security review that does not cover all four is not reviewing the full system.
1. The tool surface. Tools are the agent's interface to the world. Each tool definition specifies what the agent can do: read a file, send an HTTP request, execute a database query, send an email, call a third-party API. The tool surface is the attack surface. An agent with ten tools has a fundamentally different risk profile than an agent with two. The security review catalogs every tool the agent can invoke, assesses the scope of each tool (can it read any file, or only files in a specific directory?), and identifies which tools create irreversible side effects.
2. The delegation chain. In multi-agent systems, agents spawn sub-agents and delegate tasks to them. Each spawning event is a delegation of authority: the orchestrator authorizes the sub-agent to act within some scope. The threat at this layer is unconstrained delegation: the sub-agent inherits the full permissions of the orchestrator rather than a scoped subset. If an orchestrator has broad tool access and spawns a sub-agent without scoping its delegation, the sub-agent effectively has the orchestrator's full permissions. A compromised or manipulated sub-agent can then exercise permissions that were never intended for it.
3. The orchestration graph. Complex agentic systems have not one agent but a graph of agents: an orchestrator that coordinates multiple specialist sub-agents, each with their own tool sets and their own system prompts. The orchestration graph introduces threats that do not exist in single-agent systems: lateral movement between agents (a compromised sub-agent influencing the orchestrator or other sub-agents), confused deputy attacks (an agent acting on behalf of a party with different permissions than the requesting user), and audit trail gaps (actions taken by sub-agents that are not attributed to the originating user or session).
4. Memory and context persistence. Agents that operate across sessions, or that maintain a persistent memory store, introduce a threat that does not exist in stateless systems: contamination across sessions. If an agent's memory store can be poisoned (by a malicious user, by an injected tool response, or by a compromised sub-agent), that contamination can influence the agent's behavior in subsequent sessions with different users. The security review covers what is persisted, who can write to the memory store, and whether the memory content is validated before it influences agent behavior.
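As a rough illustration of the memory questions in item 4 (who can write, and what crosses a session boundary), the sketch below gates writes on an allowlist of writer identities and scopes reads to the originating session. The names MemoryStore, MemoryEntry, and TRUSTED_WRITERS are hypothetical and do not come from any particular agent framework; content validation is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Assumption: writer identities are plain strings; a real system would
# derive them from its own identity model.
TRUSTED_WRITERS: Set[str] = {"orchestrator", "memory-curator"}


@dataclass
class MemoryEntry:
    session_id: str  # session that produced the entry
    writer: str      # identity of the component that wrote it
    content: str


@dataclass
class MemoryStore:
    entries: List[MemoryEntry] = field(default_factory=list)

    def write(self, entry: MemoryEntry) -> bool:
        """Accept a write only from an allowlisted writer."""
        if entry.writer not in TRUSTED_WRITERS:
            return False
        self.entries.append(entry)
        return True

    def read(self, session_id: str) -> List[str]:
        """Return only entries written in the same session."""
        return [e.content for e in self.entries if e.session_id == session_id]


store = MemoryStore()
accepted = store.write(MemoryEntry("s1", "orchestrator", "user prefers weekly summaries"))
# A tool response is not a trusted writer, so this write is refused.
rejected = store.write(MemoryEntry("s1", "tool:web_fetch", "ignore prior instructions"))
```

A store that intentionally shares memory across sessions would need a different read rule, and that rule is exactly what the review should document.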
Blast radius and scope creep
Blast radius is the maximum harm an agent can cause if it is fully compromised — through prompt injection, through a manipulated tool response, or through a deliberate misuse of its permissions. It is the most important single metric in an agentic security assessment, and it is almost never explicitly documented.
Calculating blast radius requires cataloging the worst-case outcome of each tool the agent can invoke. An agent with access to a database write tool has a blast radius that includes all data that tool can modify. An agent with access to an email-sending tool has a blast radius that includes every recipient that tool can address and every attachment it can send. An agent with access to a code-execution tool has a blast radius that includes every operation the execution environment permits.
Blast radius is not calculated from runtime behavior — it is calculated from the tool definitions and the permissions those tools carry. This is why a design-time review can assess it: the tool surface is documented in the agent's configuration, not only observable in production.
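A minimal sketch of that design-time calculation, assuming each tool definition can be reduced to the resources it touches plus two flags. The ToolDef fields and the example tools are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class ToolDef:
    name: str
    resources: FrozenSet[str]  # what the tool can touch (tables, recipients, ...)
    can_write: bool            # whether the tool changes state
    irreversible: bool         # whether its side effects cannot be undone


def blast_radius(tools: List[ToolDef]) -> Dict[str, List[str]]:
    """Summarise worst-case reach from tool definitions alone (no runtime data)."""
    reachable, modifiable, irreversible = set(), set(), []
    for tool in tools:
        reachable |= tool.resources
        if tool.can_write:
            modifiable |= tool.resources
        if tool.irreversible:
            irreversible.append(tool.name)
    return {
        "reachable": sorted(reachable),
        "modifiable": sorted(modifiable),
        "irreversible_tools": sorted(irreversible),
    }


tools = [
    ToolDef("db_read", frozenset({"orders", "customers"}), can_write=False, irreversible=False),
    ToolDef("db_write", frozenset({"orders"}), can_write=True, irreversible=False),
    ToolDef("send_email", frozenset({"any-recipient"}), can_write=True, irreversible=True),
]
report = blast_radius(tools)
```

The report is only as good as the declared resources: a tool whose definition says "any-recipient" contributes an unbounded entry, which is itself a finding.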
Scope creep is the gradual expansion of an agent's tool access and permissions over time, driven by legitimate product requirements but without corresponding re-assessment of the security implications. An agent that starts with access to a read-only database tool acquires a write tool, then an external API call tool, then an email tool, then the ability to spawn sub-agents — and at no point does anyone explicitly recompute the blast radius of the system that now exists.
The re-assessment trigger for scope creep is straightforward: any time a new tool is added, a new integration is introduced, or the delegation scope is expanded, the blast radius assessment is stale and a new review is required. This trigger must be named explicitly in the clearance decision — not assumed, not remembered, but written down.
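One way to make the trigger mechanical rather than remembered is to fingerprint the security-relevant parts of the agent configuration at clearance time and compare on every change. The sketch below assumes a configuration dictionary with tools, integrations, and delegation_scope keys; those key names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def config_fingerprint(config: Dict[str, Any]) -> str:
    """Hash only the fields that affect blast radius, in a stable order."""
    relevant = {
        "tools": sorted(config.get("tools", [])),
        "integrations": sorted(config.get("integrations", [])),
        "delegation_scope": sorted(config.get("delegation_scope", [])),
    }
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_reassessment(cleared: Dict[str, Any], current: Dict[str, Any]) -> bool:
    """True when the current configuration differs from the cleared one."""
    return config_fingerprint(cleared) != config_fingerprint(current)


cleared = {"tools": ["db_read"], "integrations": [], "delegation_scope": []}
current = {"tools": ["db_read", "db_write"], "integrations": [], "delegation_scope": []}
stale = needs_reassessment(cleared, current)
```

A check like this could run in a deployment pipeline, so that a new tool blocks release until the clearance record is updated.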
Delegation chain analysis
A delegation chain is the sequence of authorizations that allows an orchestrator agent to spawn sub-agents, and each sub-agent to invoke tools on behalf of the originating session. Each link in the chain represents a transfer of authority — and each transfer introduces the risk of authority expansion, authority confusion, or authority misuse.
The central question in delegation chain analysis is whether each delegation is explicitly scoped or implicitly inherited. Explicit scoping means the orchestrator spawns the sub-agent with a defined permission set: “you may call tool A and tool B, with parameters constrained to this scope.” Implicit inheritance means the sub-agent receives the orchestrator's full permissions by default, without a documented constraint. In most multi-agent implementations, delegation is implicit — the sub-agent can do anything the orchestrator can do, without any mechanism to enforce a narrower scope.
The confused deputy attack applies directly to multi-agent systems. A confused deputy occurs when an agent acts on a request with higher authority than the requesting party is entitled to. In a multi-agent context: a sub-agent is spawned by an orchestrator that has access to sensitive data, but the sub-agent is responding to a request that originated from a user with lower clearance. The sub-agent, acting with the orchestrator's permissions, surfaces data that the originating user should not see. The sub-agent is not malicious; it is confused about whose authority it is acting under.
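A common mitigation for the confused deputy scenario above is to authorize each action against the intersection of the agent's permissions and the originating user's permissions. The sketch below assumes permissions are simple strings; the permission names are invented for illustration.

```python
from typing import FrozenSet


def effective_permissions(agent: FrozenSet[str], requester: FrozenSet[str]) -> FrozenSet[str]:
    """An agent may only exercise authority that the requester also holds."""
    return agent & requester


def authorize(action: str, agent: FrozenSet[str], requester: FrozenSet[str]) -> bool:
    return action in effective_permissions(agent, requester)


orchestrator_perms = frozenset({"read:tickets", "read:payroll"})
user_perms = frozenset({"read:tickets"})

can_read_tickets = authorize("read:tickets", orchestrator_perms, user_perms)
# The orchestrator holds this permission, but the originating user does not.
can_read_payroll = authorize("read:payroll", orchestrator_perms, user_perms)
```

This only works if the originating user's identity is carried through every delegation, which is why attribution appears again in the audit requirements.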
A delegation chain analysis produces a documented permission scope for each agent in the graph, a trace of which authorizations flow between agents, and an identification of which delegation events are unconstrained. The control plan requires that every delegation event carry an explicit, documented permission scope — not an inherited one.
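The analysis described above can be sketched as a pass over the declared delegation events, flagging any that carry no explicit scope or a scope wider than the parent's. The Delegation record and the agent names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Delegation:
    parent: str
    child: str
    scope: Optional[FrozenSet[str]]  # None means no explicit scope was declared


def analyse(agent_perms: Dict[str, FrozenSet[str]], delegations: List[Delegation]) -> List[str]:
    """Return one finding per delegation that is unconstrained or over-scoped."""
    findings = []
    for d in delegations:
        parent_perms = agent_perms[d.parent]
        if d.scope is None:
            findings.append(
                f"{d.parent}->{d.child}: unconstrained (inherits {len(parent_perms)} permissions)"
            )
        elif not d.scope <= parent_perms:
            extra = sorted(d.scope - parent_perms)
            findings.append(f"{d.parent}->{d.child}: scope exceeds parent: {extra}")
    return findings


agent_perms = {"orchestrator": frozenset({"db_read", "db_write", "send_email"})}
delegations = [
    Delegation("orchestrator", "researcher", frozenset({"db_read"})),
    Delegation("orchestrator", "mailer", None),
    Delegation("orchestrator", "fixer", frozenset({"db_write", "exec_code"})),
]
findings = analyse(agent_perms, delegations)
```

Here the researcher delegation passes, while the mailer and fixer delegations each produce a finding for the control plan.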
OWASP Agentic Top 10 mapping
The OWASP Agentic Top 10 provides a structured taxonomy of threats specific to tool-using and multi-agent AI systems. A security review maps each identified threat to this taxonomy and to the control category that addresses it.
A01 — Prompt Injection in Agentic Context. An agent that receives tool responses, environment observations, or orchestrator instructions can be injected through any of these channels. Unlike passive LLM injection, agentic injection triggers tool calls — making the impact immediate and often irreversible. Control category: input validation across all agent inputs, not just the user turn.
A02 — Excessive Agency. The agent is granted more capability — tool access, resource permissions, autonomous decision authority — than is required for its defined task. Control category: principle of least privilege for tool definitions; documented justification for each granted capability.
A03 — Privilege Escalation. Through prompt injection, tool manipulation, or orchestration graph exploitation, the agent gains capabilities beyond its defined permission set. Control category: explicit permission scoping for delegation; tool invocation validation against declared permissions.
A04 — Context Manipulation. The agent's context window is poisoned with false information (through a malicious tool response, a crafted environment observation, or injected memory), causing the agent to form false beliefs about its state and take actions based on them. Control category: tool response validation; memory integrity controls.
A05 — Insecure Memory. The agent's persistent memory can be written by unauthorized parties, is not validated before influencing behavior, or persists contamination across session boundaries. Control category: memory write access controls; memory content validation; session boundary enforcement.
A06 — Tool Injection. A malicious tool is introduced into the agent's tool set through a compromised tool registry, a malicious plugin, or an MCP server that returns unexpected tool definitions. Control category: tool registry access controls; tool definition validation at load time.
A07 — Unauthorized Lateral Movement. A compromised or manipulated agent accesses other agents, services, or data stores that it should not be able to reach. Control category: network segmentation between agents; explicit trust policies between orchestrators and sub-agents.
A08 — Supply Chain Risks. The agent depends on external models, tool libraries, or orchestration frameworks that introduce vulnerabilities through their supply chain. Control category: dependency provenance; version pinning; review of third-party tool definitions.
A09 — Audit Trail Deficiency. The agent's actions (tool calls, delegation events, sub-agent spawning) are not comprehensively logged, which makes incident investigation impractical and leaves no evidence for compliance. Control category: comprehensive action logging; log integrity controls; attribution of all tool calls to originating session.
A10 — Identity Spoofing. An agent claims an identity it does not hold — presenting itself as a trusted orchestrator to a sub-agent, or as a user to an external service — to obtain elevated access or to evade accountability controls. Control category: agent identity verification; cryptographic attestation for high-trust delegation chains.
What a cleared agentic system looks like
A clearance decision for an agentic system is more specific than for a non-agentic LLM, because the conditions that must be met are more concrete. The cleared state is not absence of risk — it is a documented configuration where the risk is bounded and where the bounds are enforced.
Scope-limited tool access. Each tool in the agent's definition is constrained to the minimum scope required. Read access is not also write access. API calls are bounded to specific endpoints. Email tools specify allowed recipients. Database tools specify allowed tables and operations. Scope limits are documented in the tool definitions, not in the system prompt, because the system prompt is not an enforcement mechanism.
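To illustrate a scope limit that lives in the tool definition rather than the prompt, the sketch below wraps a hypothetical email tool with a recipient-domain allowlist that is checked in code before anything is sent. EmailTool, ScopeViolation, and the domains are invented for this example, and the actual mail hand-off is a placeholder.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


class ScopeViolation(Exception):
    """Raised when a call falls outside the scope declared in the tool definition."""


@dataclass(frozen=True)
class EmailTool:
    allowed_domains: FrozenSet[str]
    max_recipients: int = 5

    def check(self, recipients: List[str]) -> None:
        if len(recipients) > self.max_recipients:
            raise ScopeViolation(f"too many recipients: {len(recipients)}")
        for address in recipients:
            domain = address.rsplit("@", 1)[-1].lower()
            if domain not in self.allowed_domains:
                raise ScopeViolation(f"recipient not allowed: {address}")

    def send(self, recipients: List[str], subject: str, body: str) -> str:
        # The check runs regardless of what the model was told in its prompt.
        self.check(recipients)
        # Placeholder: a real tool would hand off to a mail service here.
        return f"queued {len(recipients)} message(s)"


tool = EmailTool(allowed_domains=frozenset({"example.com"}))
result = tool.send(["alice@example.com"], "Status", "Weekly report attached.")
```

A prompt-injected instruction to mail an outside address fails in check() no matter how the model was persuaded, which is the property the system prompt cannot provide.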
Scoped delegation tokens. Every delegation event in the orchestration graph carries an explicit, documented permission scope. Sub-agents receive scoped tokens, not full permission inheritance. Delegation scope is enforced technically, not assumed.
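One possible way to enforce delegation scope technically is a signed token that lists the tools a sub-agent may call. The sketch below uses an HMAC over a JSON payload; the token layout, the hard-coded demo secret, and the function names are assumptions for illustration only. Expiry, key rotation, and parameter-level constraints are omitted.

```python
import base64
import hashlib
import hmac
import json
from typing import List

SECRET = b"demo-only-secret"  # assumption: real keys come from a secret manager


def issue_token(sub_agent: str, tools: List[str], secret: bytes = SECRET) -> str:
    """Orchestrator side: sign the sub-agent's identity and permitted tools."""
    payload = json.dumps({"sub": sub_agent, "tools": sorted(tools)}, sort_keys=True).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload).decode() + "." + base64.urlsafe_b64encode(sig).decode()


def allowed(token: str, tool: str, secret: bytes = SECRET) -> bool:
    """Tool gateway side: verify the signature, then check the requested tool."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # malformed token (binascii.Error is a ValueError subclass)
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return False
    return tool in json.loads(payload)["tools"]


token = issue_token("researcher", ["db_read"])
read_ok = allowed(token, "db_read")
email_ok = allowed(token, "send_email")
```

The sub-agent never holds the orchestrator's credentials; it holds only a token that the tool gateway can verify and that names a narrower scope.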
Comprehensive audit log for all tool calls. Every tool invocation is logged with: the tool name, the parameters, the response, the originating session, the agent identity, and the timestamp. The log is immutable. It is reviewed as part of the re-assessment process.
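The fields listed above map naturally onto a structured log record. The sketch below adds a hash chain so that later edits to earlier records are detectable; this gives tamper evidence, not immutability, which would still depend on where the log is stored. The function and field names are assumptions, and params and response are assumed to be JSON-serializable.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, List

GENESIS = "0" * 64  # prev_hash value for the first record


def _digest(record: Dict[str, Any]) -> str:
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(log: List[Dict[str, Any]], *, tool: str, params: Any, response: Any,
           session_id: str, agent_id: str) -> None:
    """Append one tool-call record, chained to the hash of the previous record."""
    record = {
        "tool": tool,
        "params": params,
        "response": response,
        "session_id": session_id,
        "agent_id": agent_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    record["hash"] = _digest(record)  # digest is taken before "hash" is added
    log.append(record)


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash and check that each record links to its predecessor."""
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True


log: List[Dict[str, Any]] = []
append(log, tool="db_read", params={"table": "orders"}, response="3 rows",
       session_id="s1", agent_id="researcher")
append(log, tool="send_email", params={"to": "alice@example.com"}, response="queued",
       session_id="s1", agent_id="mailer")
intact = verify(log)
```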
Re-assessment trigger for new tools. The clearance decision names a specific trigger: any addition of a new tool, integration, or change to the delegation scope requires a new review before the change is deployed to production. This trigger is owned by a named individual or team.
Documented blast radius. The clearance documentation includes an explicit calculation of the blast radius — the maximum harm achievable if the agent is fully compromised — based on the current tool set and permission scope. This is not a theoretical worst case; it is a bounded calculation from the actual configuration.
Frequently asked questions
- What is a delegation chain?
- The sequence of authorizations that allows an orchestrator agent to spawn sub-agents, and each sub-agent to invoke tools. Each link in the chain should carry an explicitly scoped permission set, not an inherited one. Unconstrained delegation is the most common source of privilege escalation in multi-agent systems.
- How is blast radius calculated?
- By enumerating the maximum damage achievable if the agent were fully compromised: data that could be exfiltrated, actions that could be taken, systems that could be accessed, downstream agents that could be influenced. Drel maps this from the tool surface and permission scope documented in the agent's configuration, not from runtime behavior.
- Why doesn't STRIDE work for agents?
- STRIDE models components and data flows between defined services. It doesn't model autonomy, goal-seeking behavior, or tool-use side effects. An agent that sends emails on behalf of a user doesn't appear as a threat in STRIDE — but the scope of those emails, and the conditions under which the agent sends them without user confirmation, is a concrete security concern that STRIDE has no category for.
- Does Drel monitor agents at runtime?
- No. Drel reviews agents from design artifacts: tool definitions, delegation policies, system prompts, and architecture documentation. It does not connect to running agents, observe tool calls in production, or ingest telemetry. The review is a design-time process.
- What is MAESTRO and how does it apply?
- MAESTRO is a threat modeling framework for multi-agent systems. It covers orchestrator-agent trust models, lateral movement between agents, supply chain risks in agent dependencies, and audit trail requirements for agent actions. Drel maps agentic systems to MAESTRO alongside OWASP Agentic Top 10, because the two frameworks cover complementary aspects of multi-agent security.
- What triggers a re-assessment?
- Adding a new tool, changing the orchestration graph, expanding the delegation scope, changing the model, or deploying to a new data environment. These triggers are named explicitly in the clearance decision — not left as a general reminder to review periodically.