BlogTechnical

Attack path analysis for AI systems — beyond CVE scoring

CVE scores tell you a vulnerability exists. Attack path analysis tells you whether it is reachable, exploitable, and connected to a blast radius that matters. Here is how to apply it to AI system assessments.

Drel15 June 202613 min read

Most AI security reviews still inherit the vulnerability-management mental model. A risk is named, a severity is attached, a control is recorded. That model works when the unit of analysis is a known vulnerability with a published score. It does not work when the unit of analysis is a path — the sequence of design choices, trust boundaries, and authorisation grants that together turn a possibility into an exploitable reality.

Attack path analysis is the missing layer. It asks a different question. Not “is this risk present?” but “can an attacker reach the harm, and what would they touch on the way?” For AI systems, that question is the one the AI Committee actually needs answered, and it is the one CVE scoring does not address.

What CVE scoring gives you, and doesn't

CVSS produces a base score for a named vulnerability. The score reflects exploitability metrics and impact metrics, computed against an assumed worst case. It is useful: it lets engineering teams triage at scale. But it answers a narrow question — “how bad is this vulnerability in isolation?” — and most of the work for an AI security review is in the questions that come after.

A CVSS 9.8 vulnerability on a service that is not reachable from any untrusted input is operationally a low risk. A CVSS 5.4 issue that is reachable from a public form, chains into elevated authorisation, and lands in a system holding customer PII is operationally a critical risk. The base score does not capture either situation correctly without environmental and temporal adjustments — and even those adjustments do not enumerate the path.

AI systems make the limitation sharper. Most of the relevant AI risks — indirect prompt injection, tool misuse, descriptor poisoning, confused-deputy retrieval — do not have CVEs at all. They are not implementation bugs. They are design choices and trust-boundary decisions. CVSS has no opinion on whether your RAG pipeline marks retrieved content as data or as instruction. It has no opinion on whether your agent inherits the user's authorisation when calling a tool. It has no opinion on whether the disposition memo names a re-assessment trigger.

These are the questions that matter. They are answered by walking the path.

AI system attack path types — risk level and detection

Path type	Description	Risk	Detection method
Indirect prompt injection via RAG	Attacker plants content in a data source the RAG system retrieves. The injected content reaches the model's context window and redirects its behaviour.	Critical	Behavioral monitoring for anomalous tool chains following retrieval; output scanning for unexpected instruction-following patterns.
Agentic privilege escalation	Agent is manipulated into invoking tools or capabilities it was not authorised to use — through injected permission claims, tool chaining, or memory poisoning.	Critical	Tool invocation audit log review; cross-user access boundary tests; alert on tool-call sequences that match known escalation patterns.
Tool poisoning via manifest	An attacker modifies tool descriptions in an MCP manifest to cause the model to invoke tools in unintended ways without code-layer exploitation.	High	Manifest integrity verification against signed baseline; description review for imperative language; behavioral tests with adversarial manifests.
Cross-tenant data extraction via agent	A multi-tenant agent system lacks per-user auth at the MCP layer, allowing users to access other tenants' data by crafting queries that retrieve cross-tenant content.	High	Cross-tenant boundary tests; retrieval result inspection for tenant ID leakage; audit log analysis for anomalous user-to-record relationships.
Vendor subprocessor compromise	A third-party model provider or AI subprocessor is compromised or changes terms, causing customer data to be handled outside the assessed controls.	Medium	Vendor change-notification monitoring; model-version logging with drift detection; periodic subprocessor DPA term review.

Why AI systems need path analysis

The first thing path analysis surfaces, when applied to AI, is that the “input boundary” concept fails. In classical web application threat modelling, the boundary between “untrusted” user input and the system is concrete. The user submits a form. The form data is validated. Trusted server-side processing follows.

AI systems blur this. The user submits a query, yes — but a retrieval-augmented system also pulls in content from a knowledge base, and a tool-using agent also pulls in content from tool responses. Each of those content sources can carry instructions the LLM will follow. The trust boundary is not the input boundary; the trust boundary is wherever content enters the LLM's context. Path analysis is the technique that makes this visible.

The second thing path analysis surfaces is that most of the leverage in AI security sits at the architectural level, not the operational level. The control that breaks a path is usually a design choice — “retrieved content is marked as data, not instruction” or “destructive tools require human approval” — not a runtime mitigation. CVE-style thinking pushes towards runtime mitigations because that is where vulnerabilities are patched. AI security needs the architectural lens.

The third thing path analysis surfaces is the relationship between feasibility and blast radius. Two paths with similar feasibility can have very different consequences depending on what they reach. A prompt-injection path that ends at a generated paragraph is different from a prompt-injection path that ends at a tool invocation with side effects. Path analysis makes this distinction explicit.

What an attack path is

An attack path is a sequence: entry → escalation → action. Each segment names a specific thing the attacker controls, a specific thing they leverage, and a specific outcome.

Entry is where attacker-controlled content enters the system. For AI, entries include the obvious — user input — and the less obvious: knowledge-base writes, upstream document feeds, tool responses from third-party APIs, model output that becomes the next agent's input. The path analysis starts with enumerating entries honestly; teams routinely miss the indirect ones.

Escalation is what gives the attacker capability they didn't start with. In a classical web app, escalation is usually a privilege issue. In an AI system, escalation is usually a trust-boundary issue: retrieved content treated as instruction, an agent inheriting user authorisation it shouldn't carry, a confused deputy elsewhere in the orchestration.

Action is the terminal outcome. For AI systems, actions split into two broad classes: output actions (the LLM produces text that affects a downstream consumer) and tool actions (the LLM triggers a side effect through a tool call). Path analysis takes the action seriously: a path that ends in an email being sent on the user's behalf is a different operational reality from a path that ends in a paragraph being rendered.

Building the path graph

A useful path graph names four kinds of node: assets (data and systems with value), identities (humans, services, agents),capabilities (what each identity can do), anddata flows (where content moves and under what trust).

Edges connect them: an authorisation grant connects an identity to a capability; a retrieval path connects a data store to a context window; a tool invocation connects an agent to a downstream system; a delegation chain connects an orchestrator to a sub-agent. Each edge has a trust posture: fully trusted, partially trusted with validation, untrusted.

The analysis is straightforward in concept: from each entry node, walk forward through edges, accumulating capability and trust as you go. Wherever the walk reaches an action that would be undesirable if attacker-controlled, you have a path. The disposition documents the path, the controls that break it, the controls that don't but might be added, and any residual path that is accepted with rationale.

In practice you do not enumerate every path. You enumerate the paths that lead to high-blast-radius actions, and you record the rest as a class. “Paths to output-only actions where the output is not action-triggering” can be treated as a class with a class-level control (output content filtering for the relevant patterns) rather than itemised.

Example: indirect prompt injection in RAG

A retrieval-augmented assistant serves an internal knowledge base. Users ask questions; the system retrieves relevant documents, assembles a prompt, generates a response, and returns it with citations. The path:

Entry. An attacker (insider, compromised account, vendor with ingest access) writes a document into the knowledge base. The document contains adversarial content: text that, when surfaced to the LLM as retrieved context, will be interpreted as an instruction rather than as data.
Escalation. A user query causes the retriever to surface the adversarial document. The prompt-assembly layer concatenates retrieved content directly into the prompt. The LLM sees the injection text in a context where instructions live, and follows it.
Action. The injection instructs the LLM to leak details from another retrieved document, to invoke a tool with adversary-chosen parameters, or to alter its output in a way the requesting user will not detect.

The classical “prompt injection” framing treats this as one risk. Path analysis splits it into three concrete control points: write authorisation on the knowledge base, prompt design at assembly, output validation at generation. Each control point breaks the path. Breaking any one of them closes the path, but the AI Committee should know which ones are in place and which are accepted as residual.

Example: agentic privilege escalation

A procurement agent helps users look up suppliers. The agent has access to a catalog tool and a pricing tool. The orchestrator authenticates the user, then invokes the agent on the user's behalf. The agent has its own service identity when calling downstream tools.

Entry. A user crafts a query that triggers a tool call with parameters chosen to access supplier records the user is not authorised to see.
Escalation.The agent invokes the tool with its service identity. The tool authorises the request based on the agent's broader scope, not the user's narrower scope. The result reaches the agent.
Action. The agent renders the result into the response. The user receives data they were not authorised to access.

The control point is identity. Either the tool must enforce authorisation against the originating user (not the agent), or the agent must constrain its scope to the user's authorisation at the time of the request. Both are valid; the choice is design. What is not valid is the implicit pattern where the agent inherits its own broad service scope on every call regardless of who triggered the request. That is the confused-deputy pattern, and path analysis surfaces it as a structural issue rather than a per-tool one.

Example: vendor sub-processor compromise

A customer-facing AI feature uses a vendor SaaS for content classification. The vendor is itself layered on a third-party model provider as a sub-processor. The path:

Entry.The vendor's sub-processor is compromised, or the sub-processor changes its model in a way that affects behaviour the vendor did not document.
Escalation.The vendor's service propagates the changed behaviour to your AI feature. Classification results that previously suppressed certain content now permit it, or vice versa.
Action. End users encounter the changed behaviour. Depending on the use case: harmful content reaches users, legitimate content is suppressed, or downstream automated workflows behave incorrectly.

The control points here are contractual and procedural rather than technical: sub-processor notification, model-change notification, and re-assessment triggers that fire when the vendor stack changes. Path analysis makes the control plan explicit: which contract clause, which re-assessment trigger, which monitoring process. The disposition records who owns each.

What controls break each path

The point of enumerating paths is to identify where a control breaks the chain. A single control rarely addresses every path; conversely, a single path can usually be broken in multiple places. The disposition should record which break points are in place and which are accepted as open.

For the RAG indirect-injection example, three break points: write authorisation on the knowledge base (closes the path at entry); prompt-design boundary between retrieved content and instructions (closes the path at escalation); output validationfor instruction-following anomalies (closes the path at action). A defensible disposition usually has at least two of the three.

For the agentic confused-deputy example, two break points: identity propagation to downstream tools (closes the path at escalation); or authorisation enforcement against originating user at the tool layer (closes the path at action). Pick one and implement it consistently.

For the vendor sub-processor example: contractual notification(closes the path before action by surfacing the change), and monitoring for behavioural drift (closes the path by detecting and reacting). Both are usually required; neither alone is sufficient.

Blast radius — what's downstream

Path analysis without blast-radius accounting produces a list of paths but no way to triage them. Blast radius is the scope of consequence that follows from a successful action — and for AI systems, blast radius is usually the deciding factor in which paths get controls now and which are accepted as residual.

A useful blast-radius rating considers four dimensions: reach(how many users / records / systems are affected), reversibility(can the action be undone, and by whom), data sensitivity(classification of any data exposed), and regulatory implication(whether the action triggers disclosure obligations).

Two paths with the same likelihood and the same entry/escalation profile can have very different blast radius. A path that ends in an in-app paragraph for a single user differs from a path that ends in a mass email; a path that ends in an advisory document differs from a path that ends in a financial transaction. Disposition decisions should be visibly informed by blast radius, not just by feasibility.

The unit of analysis is the path, not the vulnerability. CVSS triages bugs. Path analysis triages architecture.

What goes in the disposition

A disposition that incorporates path analysis records six things per analysed path:

Path summary. Entry, escalation, action — one to three lines each, plus a link to a longer technical note where useful.
Blast radius. The four dimensions named above, summarised as a single rating with the inputs visible.
Break points considered. The candidate controls that could close the path. Not all of them — the ones that are realistic in your architecture.
Break points implemented. Which of the candidate controls are in place, with evidence references.
Residual path. If the path is not fully closed, what remains and why it is acceptable — named acceptor, conditions for acceptance, re-assessment trigger.
Re-assessment trigger. What change to the architecture would open a new path or invalidate the current closure. Adding a tool, changing the prompt design, switching the model, or modifying the authorisation model are the common ones.

The disposition is not a list of risks; it is a record of considered paths. The distinction matters because risks have natural homes in a risk register, but paths are about the architecture and belong with the design record. A useful AI security review keeps both.

See where attacks reach — not just where vulnerabilities live.

Drel maps attack paths through assessed AI systems, identifies the control points that break each path, and produces a structured analysis record.

Request early access See the demo dossier

A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.