
Five mistakes that make an AI security review indefensible

Most AI security reviews fail not because they miss threats, but because they miss the structure that makes a decision defensible. These five mistakes appear in almost every review we have examined.

Drel Research · 10 min read

Most AI security reviews that fail do not fail because they missed the right threat. They fail because structural problems make the review output indefensible regardless of the threats identified. A review with a complete threat model but controls that have no verification methods is not safer than one with fewer threats and verified controls. It is just more words on a page.

The five mistakes in this piece appear consistently in reviews we have examined. They are not exotic edge cases. They are the default outcome when a team runs a review without a clear structural model of what the review is supposed to produce. Each mistake is described with the context in which it appears, why it matters, and the specific fix.

Why reviews fail structurally

A review can fail in two ways: it can fail to identify the right risks, or it can fail to produce a defensible record. The first failure is a knowledge problem — the reviewers did not know about a particular attack class or did not understand the system well enough. The second failure is a structural problem — the review produced output that cannot be used to support a clearance decision or to defend it afterwards.

Structural failures are more common and more consequential than knowledge failures. Knowledge gaps can be addressed by adding expertise. Structural gaps mean the review did not produce what it was supposed to produce — and often, neither the review team nor the governance committee noticed until the record was put under pressure.

A review that identifies every threat correctly but cannot be defended by the person who signed it is a structural failure. The goal of the review is a defensible clearance decision, not a comprehensive risk catalogue.

Five structural mistakes — why they happen and what a regulator sees

| Mistake | Why it happens | What a regulator sees | The fix |
| --- | --- | --- | --- |
| Scope too wide to complete | Defining scope as 'all AI systems' or describing the system so broadly the boundary has no natural end. Feels thorough but has no completion criterion. | No clearance decision ever issued; system went to production without a defensible governance record. | Produce a one-page scope document before the review begins: system boundary, deployment context, clearance decision it must produce. Agreed in writing. |
| Threat-modelling an incomplete system | Review starts before architecture is fixed; tool manifest, retrieval source, or data access are still open. Threat model diverges from the system actually deployed. | Clearance was issued against a different system than the one operating in production. | Enforce a 'review-ready' gate: boundary fixed, model chosen, data flows documented, tool manifest finalised before the first threat-modelling session. |
| Controls without verification methods | Control plan lists controls as 'implemented' or 'planned' with no method for confirming that assessment. Status is based on intent, not evidence. | 'Was the control verified?' 'We implemented it.' 'How do you know?' No answer. Does not survive scrutiny. | Every control must have a specific verification method written at the time the control is specified, producing a binary outcome when applied. |
| Risk acceptance without a named acceptor | Disposition says risks are 'accepted by the team' or residual risk section is absent because 'there are no significant residual risks'. | When the risk materialises: 'Who accepted this?' 'Everyone in the meeting.' Diffused accountability equals no accountability. | Every named residual risk needs: specific person, named role, the condition under which the acceptance holds. No anonymous or collective acceptances. |
| No re-assessment triggers registered | Disposition issued with no trigger framework. System evolves (model update, new tools, scope expansion) without any of these changes firing a re-review. | Clearance record describes a substantially different system from the one currently operating. Point-in-time snapshot masquerading as ongoing assurance. | Every disposition must include specific, unambiguous triggers agreed at review time, covering model change, scope expansion, incident, and system-specific conditions. |

Mistake 1: Scope too wide to complete

What it looks like: The scope includes “all AI systems in the organisation”, “the entire AI strategy”, or a specific system described so broadly that the reviewers cannot agree on what is inside the boundary. The threat model becomes a horizon-scanning exercise. Nothing gets to the control plan because there is always another surface to examine.

Why it matters: A review that does not complete does not produce a clearance decision. The system either goes to production without a clearance (defeating the purpose of the review) or is delayed indefinitely (creating a delivery conflict that eventually overrides the process). Neither is acceptable.

The scope problem often appears because defining a tight boundary feels like it is “missing risks” outside the boundary. It is not. Risks outside the boundary are either covered by a separate review (and that reference must be documented) or they are explicitly accepted as out of scope for this review (which must also be documented). The boundary is not a statement that outside risks do not exist. It is a statement about what this review is authorising.

The fix: Before the review begins, produce a one-page scope document that names the system boundary, the deployment context, and the clearance decision the review is meant to produce. Get it agreed in writing. If a risk appears during the review that is outside the agreed boundary, note it as out-of-scope with a recommended follow-on action — do not expand the scope mid-review.
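One way to make the scope document concrete is to treat it as a small data structure. The sketch below is illustrative only: the names `ScopeDocument`, `OutOfScopeFinding`, `agreed_by` and `triage`, and all example values, are assumptions rather than part of any standard. It shows the one behaviour that matters: a risk outside the boundary is recorded with a follow-on action, and the boundary itself never changes mid-review.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class OutOfScopeFinding:
    description: str
    follow_on_action: str  # e.g. a pointer to a separate review


@dataclass
class ScopeDocument:
    system_boundary: frozenset[str]  # components this review authorises
    deployment_context: str
    clearance_decision: str          # the decision the review must produce
    agreed_by: tuple[str, ...]       # sign-offs obtained in writing
    out_of_scope: list[OutOfScopeFinding] = field(default_factory=list)

    def is_agreed(self) -> bool:
        """The review should not start until every field is filled in and signed off."""
        return all([self.system_boundary, self.deployment_context,
                    self.clearance_decision, self.agreed_by])

    def triage(self, component: str, description: str, follow_on_action: str) -> bool:
        """Return True if the risk is in scope; otherwise note it and keep the boundary fixed."""
        if component in self.system_boundary:
            return True
        self.out_of_scope.append(OutOfScopeFinding(description, follow_on_action))
        return False


# Hypothetical example values, for illustration only.
scope = ScopeDocument(
    system_boundary=frozenset({"support-assistant", "retrieval-index"}),
    deployment_context="internal helpdesk, authenticated staff only",
    clearance_decision="clear for production, or not",
    agreed_by=("system owner", "security lead"),
)
in_scope = scope.triage("hr-chatbot", "prompt injection via uploaded documents",
                        "open a separate review for hr-chatbot")
```

Using a `frozenset` for the boundary is a deliberate choice in this sketch: expanding scope requires issuing a new document rather than quietly appending to the old one.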

Mistake 2: Threat-modelling an incomplete system

What it looks like: The review begins before the system is fully specified. Key questions are still open: which retrieval source will be used, whether the model will have tool-call capabilities, what data the knowledge base will contain. The threat model is produced against a notional system, not the actual system. By the time the system is finalised, the threat model has diverged from it.

Why it matters: A threat model is only as good as the system description it is based on. A threat model produced against an incomplete system description will systematically miss threats that only apply to the complete system. If the model is given tool-call capabilities that were not specified at review time, the entire agentic attack surface — tool-call hijacking, blast radius assessment, permission escalation paths — was never evaluated. The clearance was issued against a different system than the one deployed.

This is a particular problem in agile delivery environments where the system evolves during the sprint cycle that also contains the security review. The review is scheduled, the system changes, and the review is completed against a version of the system that is already out of date.

The fix: Establish a “review-ready” gate. The system must be sufficiently specified to support a complete threat model before the review starts. What counts as sufficiently specified: the system boundary is fixed, the model and its configuration are chosen, the data flows are documented, and the tool manifest (for agentic systems) is finalised. If the system changes after the review, the change must be evaluated against the re-assessment trigger criteria (see Mistake 5); if it meets a trigger, a new review is required.
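The gate can be expressed as a checklist that returns the reasons a review cannot start yet. This is a minimal sketch, assuming the four criteria listed above are tracked as simple flags; `SystemSpec` and `review_ready_blockers` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SystemSpec:
    boundary_fixed: bool
    model_and_config_chosen: bool
    data_flows_documented: bool
    is_agentic: bool
    tool_manifest_finalised: bool = False


def review_ready_blockers(spec: SystemSpec) -> list[str]:
    """Return the reasons the review cannot start; an empty list means the gate is open."""
    checks = [
        (spec.boundary_fixed, "system boundary is not fixed"),
        (spec.model_and_config_chosen, "model and configuration are not chosen"),
        (spec.data_flows_documented, "data flows are not documented"),
        # The tool manifest requirement only applies to agentic systems.
        (not spec.is_agentic or spec.tool_manifest_finalised,
         "tool manifest is not finalised"),
    ]
    return [reason for ok, reason in checks if not ok]


# Hypothetical draft system: agentic, with two open questions.
draft = SystemSpec(boundary_fixed=True, model_and_config_chosen=True,
                   data_flows_documented=False, is_agentic=True)
blockers = review_ready_blockers(draft)
```

Returning the list of blockers rather than a bare boolean gives the delivery team something specific to close before the first threat-modelling session is scheduled.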

Mistake 3: Controls without verification methods

What it looks like: The control plan lists controls (“access control will be implemented”, “human approval boundary will be enforced”) but none of them specifies how it will be verified. The status of each control is “implemented” or “planned” with no method for confirming that assessment.

Why it matters: A control without a verification method is a promise, not a control. When the clearance decision references the control plan as the basis for the clearance, it is effectively saying “we believe these things are true” rather than “we have confirmed these things are true.” Under audit, this distinction is decisive. “Was the control verified?” “We implemented it.” “How do you know?” “We said so in the control plan.” This does not survive scrutiny.

The verification gap is also a practical problem. Controls that are not verified are frequently not in place. Teams mark them as implemented based on intent rather than evidence. The gap becomes visible only when an attacker or an incident reveals that the assumed control was never operational.

The fix: Every control in the control plan must have a verification method that is specific enough to produce a binary outcome. “Code review confirms no retrieval outside authenticated user session: see PR #1234” is a verification method. “Access control is implemented” is not. The verification method should be specified at the time the control is written, not at the time it is verified, because specifying it forces clarity about what “implemented” means for this control.
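The difference between a promise and a verified control can be made mechanical. The sketch below is an assumption about how a control plan entry might be modelled, not a prescribed schema; `Control`, `control_status` and the status labels are hypothetical, and the third entry's verification method is invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Control:
    name: str
    verification_method: Optional[str] = None  # written when the control is specified
    passed: Optional[bool] = None              # binary outcome; None until the method is applied
    evidence: Optional[str] = None             # where the result is recorded


def control_status(control: Control) -> str:
    if not control.verification_method:
        return "promise"      # no way to confirm it, so not yet a control
    if control.passed is None or not control.evidence:
        return "unverified"   # method exists but has not been applied and recorded
    return "verified" if control.passed else "failed"


plan = [
    Control("Access control is implemented"),
    Control(
        "Retrieval restricted to the authenticated user session",
        verification_method="Code review confirms no retrieval outside authenticated user session",
        passed=True,
        evidence="PR #1234",
    ),
    Control(
        "Human approval boundary is enforced",
        # Hypothetical verification method, for illustration only.
        verification_method="Test: a tool call submitted without approval is rejected",
    ),
]
statuses = {c.name: control_status(c) for c in plan}
```

In this sketch a control only counts as verified when the method has been applied, produced a pass, and left evidence behind. "Implemented" as a self-reported status does not appear anywhere in the model.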

Mistake 4: Accepting risk without naming the acceptor

What it looks like: The disposition record says residual risks are “accepted”, or “accepted by the team”, or “accepted subject to the controls being in place”, without naming a specific person with a specific role. Alternatively, the residual risk section is simply absent because “there are no significant residual risks”, which is almost never true for a real AI system in production.

Why it matters: Risk acceptance is a governance act. It requires a person with the authority to accept the risk and the accountability to be named as having done so. “The team accepted it” is diffused accountability: when no single person is accountable, no one is. When the risk materialises, the question “who accepted this?” will be asked. If the answer is “everyone in the meeting”, the governance record has failed.

Claiming no significant residual risks is a related failure. Every AI system in production has residual risks — threats that are mitigated but not eliminated. Stating that there are none either means the threat model was not thorough enough to find them, or that the control plan claims to eliminate risks it does not actually eliminate. Both are problems.

The fix: Every named residual risk in the disposition must have a named acceptor: a specific person, identified by name and role, who has reviewed the risk description and confirmed acceptance. The acceptance must state the condition under which it holds, because acceptances are conditional. “I accept this risk as long as the human approval boundary remains in place at the model gateway” is a conditional acceptance. When that condition is no longer met, the acceptance is voided and a re-assessment is triggered.
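A record type can refuse anonymous or collective acceptances at the point of entry. The following is a sketch under stated assumptions: `RiskAcceptance`, `COLLECTIVE_ACCEPTORS` and the status strings are hypothetical, the person and role are placeholders, and the list of rejected acceptor values is deliberately short rather than exhaustive.

```python
from dataclasses import dataclass

# Illustrative, non-exhaustive list of values that are not a named person.
COLLECTIVE_ACCEPTORS = {"", "team", "the team", "everyone", "everyone in the meeting"}


@dataclass(frozen=True)
class RiskAcceptance:
    risk: str
    acceptor_name: str
    acceptor_role: str
    condition: str  # the condition under which the acceptance holds

    def __post_init__(self) -> None:
        if self.acceptor_name.strip().lower() in COLLECTIVE_ACCEPTORS:
            raise ValueError("acceptance needs a specific named person")
        if not self.acceptor_role.strip():
            raise ValueError("acceptance needs a named role")
        if not self.condition.strip():
            raise ValueError("acceptance needs the condition under which it holds")

    def status(self, condition_holds: bool) -> str:
        """An acceptance is void as soon as its condition stops being true."""
        return "valid" if condition_holds else "voided, re-assessment triggered"


# Placeholder person, role and risk, for illustration only.
acceptance = RiskAcceptance(
    risk="Model may propose a harmful tool call",
    acceptor_name="Jane Example",
    acceptor_role="Head of Platform Engineering",
    condition="the human approval boundary remains in place at the model gateway",
)

try:
    RiskAcceptance("Same risk", "the team", "", "")
    collective_rejected = False
except ValueError:
    collective_rejected = True
```

Whether the condition still holds is an input here rather than something the record can determine for itself. Someone has to check the gateway and report the result.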

Mistake 5: Not defining re-assessment triggers

What it looks like: The disposition is issued and the review is considered complete. No triggers are registered. The system evolves over the following months — model updated, new tools added, scope expanded — without any of these changes being evaluated against the original clearance. The disposition becomes increasingly stale until an audit, a procurement question, or an incident reveals that the clearance record describes a substantially different system from the one currently operating.

Why it matters: A clearance decision with no triggers is a point-in-time snapshot masquerading as an ongoing assurance. The governance committee that issued it understood the system as it was at the time of review. As the system changes, that understanding becomes inaccurate. The clearance is still on file, apparently valid, while the basis for it has eroded.

This is the most common gap in organisations that have conducted initial AI security reviews. The initial clearance exists. The trigger framework does not. Over 12 months, the system changes in ways that individually seem minor but collectively represent a materially different risk profile.

The fix: Every disposition must include a set of re-assessment triggers agreed at the time of review. Triggers must be specific enough to be unambiguous when they fire. They should cover the four standard trigger categories: model change, scope expansion, incident, and any system-specific conditions identified during the review. Register who is responsible for evaluating trigger events and updating the governance record when they occur.
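A trigger register can be checked for coverage and consulted when a change arrives. This sketch assumes the four categories named above are modelled as an enum and that each change event is classified into one of them by a person. The names `Category`, `Trigger`, `uncovered_categories` and `triggers_to_evaluate` are hypothetical, as are the example conditions and owners.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    MODEL_CHANGE = "model change"
    SCOPE_EXPANSION = "scope expansion"
    INCIDENT = "incident"
    SYSTEM_SPECIFIC = "system-specific condition"


@dataclass(frozen=True)
class Trigger:
    category: Category
    condition: str  # specific enough to be unambiguous when it fires
    owner: str      # who evaluates the event and updates the governance record


def uncovered_categories(triggers: list[Trigger]) -> set[Category]:
    """Categories with no registered trigger: gaps in the disposition."""
    return set(Category) - {t.category for t in triggers}


def triggers_to_evaluate(triggers: list[Trigger], change: Category) -> list[Trigger]:
    """Triggers whose owners must assess a change of the given category."""
    return [t for t in triggers if t.category is change]


# Hypothetical register, for illustration only.
register = [
    Trigger(Category.MODEL_CHANGE, "model or model version is replaced", "system owner"),
    Trigger(Category.SCOPE_EXPANSION, "a tool is added to the tool manifest", "security lead"),
    Trigger(Category.INCIDENT, "any security incident involving the system", "security lead"),
]
gaps = uncovered_categories(register)
to_check = triggers_to_evaluate(register, Category.SCOPE_EXPANSION)
```

Matching on category alone is coarse. It routes a change to the right owner but leaves the judgement about whether the specific condition was met to that person, which is why each trigger carries a named owner.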

The pattern they share

These five mistakes share a common structure: each one produces a review that looks complete from the outside but cannot be defended from the inside. A scope so wide nothing finishes. A threat model based on an incomplete system. Controls that cannot be verified. Risk acceptances with no named acceptor. A clearance with no mechanism for self-correction.

The pattern is: the review produces the appearance of governance without the substance. It creates a document that can be pointed to as evidence of a process, but that collapses when examined by someone who knows what to look for.

The test for any review artefact is not “does this exist?” but “could the person who signed this defend it in a meeting with someone who has the incentive to find gaps?” Apply that test to each artefact and the structural failures become visible immediately.

Fixing a review that has these gaps

If your existing AI security reviews have one or more of these gaps, the fix does not require starting over. Each gap has a specific remediation:

  • Wide scope: Produce a scope document that bounds the existing review to a specific system and deployment context. Document what was excluded and why. This does not require a new review; it requires a scope document that makes the existing review’s boundary explicit.
  • Incomplete system description: Update the system description to match the current system. Evaluate whether any elements of the current system were not present at review time. If they were not, assess whether they would have changed the threat model. If yes, run a targeted supplementary review for those elements.
  • Controls without verification: For each control, add a verification method and conduct the verification. This can be done control by control without reopening the whole review. Update the control plan entries with the verification results.
  • Unnamed risk acceptors: Return the residual risk section to the relevant governance stakeholders and obtain named acceptances. This is a governance conversation, not a technical one.
  • Missing triggers: Add triggers to the disposition record now. Review the changes that have occurred since the initial review and assess whether any of them would have fired the triggers you are now registering. If any significant changes were made since the clearance, consider whether a supplementary review is warranted.

The goal of remediation is not a perfect review record. It is a defensible one — a record that an informed reviewer can read and conclude that the right questions were asked, answered honestly, and recorded with enough specificity to verify. That is the standard an AI security review must meet to be worth having.
