AI security review vs penetration testing — different questions
A penetration test asks: can this system be exploited? An AI security review asks: should this system go to production, and under what conditions? The questions are related but not the same. Running only a pentest leaves most AI risk unaddressed.
Security teams deploying AI systems are being asked to do something new: apply a discipline built for deterministic software (penetration testing) to a class of systems whose behaviour is probabilistic, context-dependent, and governed by training rather than code. The result is confusion about which assessment is needed, when, and what evidence each one actually produces.
Both a penetration test and an AI security review are legitimate and necessary. But they answer different questions, operate on different inputs, and produce different evidence. Conflating them leads teams to run a pentest and consider their AI security obligations discharged — leaving a wide class of risks unaddressed.
Two different questions
The core distinction is this:
- A penetration test asks: can this system be exploited? It tests whether known attack techniques succeed against the deployed system. Success means finding a path from attacker to impact.
- An AI security review asks: should this system go to production, and under what conditions? It evaluates whether the system has been designed, configured, and bounded in a way that makes it appropriate to operate at the stated risk tolerance.
These questions are related. A pentest may reveal that controls assumed to be present in the review are absent or bypassable. A review may define the attack surfaces the pentest should prioritise. But they are not interchangeable, and running one does not make the other redundant.
A penetration test that finds no exploits does not mean the system should be deployed. It means the tested attack paths did not succeed on the day they were tested. The clearance decision requires more than that.
AI security review vs penetration test — at a glance
| Dimension | Penetration test | AI security review |
|---|---|---|
| Primary question | Can this system be exploited by an active attacker? | Should this system go to production, and under what conditions? |
| When it runs | Before production or on a periodic schedule against the deployed system | Before deployment; again on model change, scope expansion, or incident |
| Output | Exploit findings, severity ratings, remediation recommendations | Disposition memo: clearance decision, control plan, residual risk acceptance, re-assessment triggers |
| Who signs off | Security team accepts or remediates findings; no formal clearance decision | Named governance authority (CISO, AI Committee) issues the clearance decision |
| Framework evidence | Supports control verification; cited in evidence pack as test evidence | Primary artefact for ISO 42001, EU AI Act Art. 9, NIST AI RMF Map/Manage |
| Scope | Deployed system: API boundary, model interface, output handling, tool calls | Architecture, threat model, control plan, data flows, deployment context fit |
What a penetration test covers
A traditional penetration test of an AI application typically covers:
- Infrastructure and API security. Authentication, authorisation, rate limiting, input validation at the API boundary, data leakage through HTTP responses.
- Model interface exploitability. Direct prompt injection attempts, jailbreak techniques against the production system, system prompt extraction attempts.
- Output handling vulnerabilities. Whether model outputs are rendered unsanitised in contexts where they can cause client-side harm (XSS via LLM output, for example).
- Tool call security. For agentic systems, whether tool invocations can be hijacked or elevated beyond intended permissions.
LLM-specific penetration testing has matured significantly. Frameworks like the OWASP LLM Top 10 and the OWASP Agentic Top 10 give testers structured attack catalogues. A good LLM pentest is meaningfully different from a traditional web application pentest: it requires testers who understand prompt engineering, model behaviour, and the specific attack surfaces of the model type in use.
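To make the output handling item concrete, here is a minimal Python sketch of the kind of control a tester would probe: escaping model output before it is placed in an HTML page. The function name, the `llm-answer` class, and the payload are all hypothetical, and a real application would normally rely on its templating framework's auto-escaping rather than hand-built strings.

```python
import html


def render_model_output(model_output: str) -> str:
    """Wrap model output in a <div>, escaping it so it displays as text, not markup."""
    # html.escape replaces &, < and >; quote=True also escapes " and '.
    escaped = html.escape(model_output, quote=True)
    return f'<div class="llm-answer">{escaped}</div>'


# Hypothetical model output carrying an injected payload.
payload = 'Summary ready. <img src=x onerror="alert(1)">'
print(render_model_output(payload))
```

A pentest of this surface checks whether any rendering path skips the escaping step. Whether escaping is the right control for that path in the first place is a question for the review.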
What an AI security review covers
An AI security review is a design-time assessment. It operates on the architecture, the configuration, the data flows, the threat model, and the control plan — not on a live instance of the running system. It produces a clearance decision: a documented judgment that the system meets (or does not meet) the threshold to operate in the stated deployment context.
The domains a review covers that a pentest does not:
- Threat model completeness. Does the threat model account for all relevant attack surfaces — including surfaces the pentest will not test, like supply chain, training data, and indirect injection through knowledge bases?
- Control coverage of the threat model. For each identified threat, is there a specified control? Does each control have an owner, a verification method, and a lifecycle gate?
- Residual risk acceptance. For risks that controls do not fully close, is there an explicit, named acceptance by a person with authority to accept it?
- Data handling and privacy review. Do the system’s data flows comply with the applicable data protection framework? Are there control gaps in how personal data is processed by the model?
- Deployment context fit. Is this system appropriate for the user population, the data classification, and the organisational risk tolerance it is being deployed into?
- Re-assessment triggers. What changes to the system or context will require a new review?
The output of a review is not a list of exploits found. It is a disposition memo that captures the clearance decision, the required controls, the accepted residual risks, and the conditions under which the clearance must be revisited. That memo is the artefact a regulator, an auditor, or a post-incident investigator will ask for.
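The control coverage question can be sketched as a simple check over structured data. The Python below is illustrative only: the `Control` fields mirror the attributes named above (owner, verification method, lifecycle gate), but the class, the function, and the threat and control IDs are assumptions made for this example, not a prescribed schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    control_id: str
    threat_ids: tuple[str, ...]  # threats this control is meant to address
    owner: str
    verification_method: str
    lifecycle_gate: str


def coverage_gaps(threat_ids: list[str], controls: list[Control]) -> tuple[list[str], list[str]]:
    """Return (threats with no control, controls missing a required attribute)."""
    covered = {t for c in controls for t in c.threat_ids}
    uncovered = [t for t in threat_ids if t not in covered]
    incomplete = [
        c.control_id
        for c in controls
        if not (c.owner and c.verification_method and c.lifecycle_gate)
    ]
    return uncovered, incomplete


# Hypothetical threat model and control plan.
threats = ["T1-indirect-injection", "T2-system-prompt-extraction", "T3-supply-chain"]
controls = [
    Control("C1", ("T1-indirect-injection",), "platform-team", "pentest", "pre-production"),
    Control("C2", ("T2-system-prompt-extraction",), "", "red team exercise", "pre-production"),
]

uncovered, incomplete = coverage_gaps(threats, controls)
print("Threats without a control:", uncovered)
print("Controls missing owner, verification method, or gate:", incomplete)
```

In this example the supply chain threat has no control and `C2` has no owner. Both are gaps that a review surfaces and a pentest would not.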
Where they overlap
The overlap between a pentest and an AI security review is real and valuable. In practice:
- The review’s threat model informs the pentest scope. A review that identifies indirect prompt injection via retrieved documents as a key risk tells the pentest team where to focus.
- The pentest’s findings feed back into the review. A pentest that succeeds in bypassing a control listed in the disposition as “in place” is evidence that the clearance was issued on a false assumption. The disposition must be updated.
- Both contribute to the evidence pack. The review produces the clearance decision and the control plan. The pentest report is an evidence artefact that supports or challenges the control verification entries.
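The feedback loop from pentest to review can be expressed as a reconciliation step. The sketch below assumes each pentest finding records which control (if any) the tester got past, and that the disposition records a status string per control. The names, IDs, and status values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PentestFinding:
    finding_id: str
    bypassed_control: Optional[str]  # control ID the tester got past, if any


def controls_to_reopen(
    disposition_controls: Dict[str, str], findings: List[PentestFinding]
) -> List[str]:
    """Return IDs of controls recorded as 'in place' that a finding bypassed."""
    bypassed = {f.bypassed_control for f in findings if f.bypassed_control}
    return sorted(
        cid
        for cid, status in disposition_controls.items()
        if status == "in place" and cid in bypassed
    )


# Hypothetical disposition entries and pentest results.
disposition_controls = {"C1": "in place", "C2": "in place", "C3": "planned"}
findings = [
    PentestFinding("F-01", "C2"),
    PentestFinding("F-02", "C3"),
    PentestFinding("F-03", None),
]

print(controls_to_reopen(disposition_controls, findings))
```

Any control this returns is one the clearance relied on but the pentest defeated, so the disposition entry for it needs to be revisited.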
Running both in a single deployment cycle
For a typical AI system going to production, the combined cycle looks like this:
- Design-time review (weeks 1–3). System boundary is documented. Threat model is produced. Control plan is drafted and assigned. Residual risks are identified and accepted. Clearance decision is issued, often “restricted pilot” at this stage, with a list of controls that must be in place before full production.
- Pre-production pentest (weeks 4–5). The pentest is scoped against the threat model from the review. The pentest team is given the control plan so they know which controls are claimed to be in place. Findings feed back into the control plan as open items or as verification evidence.
- Control closure (weeks 5–6). Pentest findings that represent control gaps are remediated. The disposition is updated to reflect the closure status of each required control.
- Go-live clearance (week 6+). The disposition is updated to “cleared for production” once the required controls are verified as in place. The pentest report is attached to the evidence pack.
This sequence produces a defensible record at each stage. The pentest is not a substitute for the review, and the review does not make the pentest redundant. Each does what the other cannot.
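The go-live gate in the cycle above reduces to one rule: the disposition moves to “cleared for production” only when every required control is verified. A minimal sketch follows, assuming just the two statuses named in the text and a hypothetical mapping of control ID to verification state. In practice a governance authority makes this call, and the function only illustrates the condition they are checking.

```python
from __future__ import annotations


def clearance_decision(required_controls: dict[str, bool]) -> tuple[str, list[str]]:
    """Return (status, open control IDs) for a set of required controls.

    required_controls maps a control ID to True once it has been verified.
    Assumption: an empty mapping means the control plan is not recorded yet,
    so the system stays in restricted pilot.
    """
    open_items = sorted(cid for cid, verified in required_controls.items() if not verified)
    if required_controls and not open_items:
        return "cleared for production", []
    return "restricted pilot", open_items


# Hypothetical state after the pre-production pentest, then after control closure.
print(clearance_decision({"C1": True, "C2": False, "C3": True}))
print(clearance_decision({"C1": True, "C2": True, "C3": True}))
```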
The evidence each produces
When a governance committee, an auditor, or a regulator asks for the evidence that an AI system was properly assessed, the two disciplines contribute different artefacts:
| Artefact | Produced by | What it demonstrates |
|---|---|---|
| Disposition memo | AI security review | Clearance decision, rationale, required controls, residual risk acceptance, re-assessment triggers |
| Threat model | AI security review | Completeness of risk identification for the assessed system |
| Control plan | AI security review | Coverage of identified threats with verifiable controls |
| Pentest report | Penetration test | Exploitability of attack paths on the deployed system |
| Red team findings | Penetration test | Model-specific attack results (jailbreak, injection, extraction) |
| Control verification evidence | Both | That specified controls are actually in place |
The two evidence sets are complementary. A governance pack that contains only a pentest report shows that the deployed system was tested for known exploits. It does not show that the system was reviewed against its risk profile, that residual risks were accepted by accountable parties, or that there is a process for re-assessment when the system changes. A pack that contains both shows all of that.
The most defensible posture is a review that sets the design-time baseline, a pentest that verifies the runtime implementation of the controls, and a disposition that records both and ties them to the clearance decision.
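As an illustration, the artefact table above can be turned into a completeness check for an evidence pack. The artefact names and producers come from the table. The function name and the representation of a pack as a set of strings are assumptions made for this sketch.

```python
from __future__ import annotations

# Artefact -> discipline that produces it, as listed in the table above.
EXPECTED_ARTEFACTS = {
    "disposition memo": "AI security review",
    "threat model": "AI security review",
    "control plan": "AI security review",
    "pentest report": "penetration test",
    "red team findings": "penetration test",
    "control verification evidence": "both",
}


def missing_artefacts(pack: set[str]) -> dict[str, list[str]]:
    """Group artefacts absent from the pack by the discipline that produces them."""
    gaps: dict[str, list[str]] = {}
    for artefact, producer in EXPECTED_ARTEFACTS.items():
        if artefact not in pack:
            gaps.setdefault(producer, []).append(artefact)
    return gaps


# Hypothetical pack that contains only pentest outputs.
pentest_only_pack = {"pentest report", "red team findings"}
for producer, artefacts in missing_artefacts(pentest_only_pack).items():
    print(f"Missing from {producer}: {', '.join(artefacts)}")
```

Run against a pentest-only pack, the check reports every review artefact as missing, which is the gap described above.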