Agent memory as an attack surface
Agents that persist memory across sessions carry forward context that can be poisoned. An attacker who controls a past interaction can plant instructions that execute in a future session. This piece maps the memory attack surface and the controls that bound it.
Stateless systems are easy to secure: each request is independent, and an attacker who influences one request gains nothing in future requests. Agentic AI systems are frequently stateful — they are designed to remember, to improve with use, and to maintain continuity across sessions. This statefulness is a product feature and a security vulnerability.
Memory poisoning is the class of attacks that exploits agentic statefulness. It is distinct from prompt injection in one critical respect: its effects persist. A successful memory poisoning attack survives session boundaries, executes against future users, and may be impossible to detect without reviewing what the agent stored and when.
Agent memory types — attack vectors and controls
| Memory type | What it stores | Attack vector | Control required |
|---|---|---|---|
| In-context window | Current session: system prompt, conversation history, tool results, retrieved content | Prompt injection via user input or retrieved content in the current session | Treat all non-system-prompt content as untrusted data; anchor goals to resist displacement by context growth |
| Persistent external memory | Cross-session data: user preferences, task history, knowledge base documents in vector/relational stores | Malicious document inserted into the store via any write path; retrieved in future sessions | Restrict write access to authorised ingestion pipelines; validate and sanitize content before storage; record provenance for every entry |
| Semantic vector store | Embeddings of documents and past interactions, queried by semantic similarity at session start or during tasks | Adversarially crafted document embedded to rank highly for target query patterns, influencing model outputs | Trust-weight retrieval ranking by source provenance; audit ingestion pipeline write access; test adversarial retrieval ranking |
| Tool call history | Log of past tool invocations, parameters, and results — sometimes surfaced to the model as prior context | Attacker-influenced tool results stored as history and retrieved as trusted context in future sessions | Validate tool results before storing as history; tag results with trust level of the source; isolate per-user tool history |
| Shared agent state | State shared between multiple agents in a multi-agent system — coordination data, task status, delegated context | Compromised worker agent writes attacker-controlled content to shared state, influencing orchestrator or peer agents | Enforce write permissions per-agent role; validate state entries before consumption; isolate shared state namespaces by trust boundary |
Three types of agent memory
Agentic systems implement memory in three distinct ways, each with different characteristics and different vulnerabilities:
- In-context memory— the conversation history and context accumulated within the current session. It exists in the model's context window and is discarded when the session ends. It includes the system prompt, all user messages, all model responses, and all tool call results.
- External memory — information stored outside the model, in vector databases, relational stores, or document stores, and retrieved by the agent via search or query. It persists across sessions and is accessible to any session that queries the store.
- Episodic memory — summaries of past sessions stored by the agent itself, used to provide context about past interactions when starting a new session. It is generated by the model (which introduces its own risks) and retrieved at session start.
Each type has a distinct attack surface, a distinct poisoning method, and distinct controls. A complete memory security review must address all three that are present in the assessed system.
In-context memory: the current-session surface
In-context memory is the most visible memory type because it is bounded by the session. Everything in the context window is available to the model for this session and gone when the session ends.
The security concern for in-context memory is prompt injection: an attacker who controls any content that enters the context window can influence the model's behavior for the duration of the session. This includes direct user input, tool return values, retrieved documents, and any other content the agent reads.
In-context memory also accumulates across the session. Early in a session, the model might correctly follow the system prompt's constraints. As the context grows — with retrieved content, tool results, and multi-turn conversation — the model's effective reasoning can shift. Injected content early in the session remains in context and can influence decisions made much later in the same session.
Controls for in-context memory poisoning: treat all non-system-prompt content as untrusted data, not as instructions; implement goal anchoring in the system prompt that resists displacement by context growth; and for long-running agents, periodically re-evaluate whether the agent's current behavior is consistent with its original goal.
External memory: the persistent store surface
External memory — vector stores, document databases, relational stores used for agent context — is what allows an agent to access knowledge that would not fit in a single context window. It is queried at session start or during the session to retrieve relevant context, which is then added to the model's context window.
The attack surface is the write path to external memory. An attacker who can insert content into the external memory store can influence any future session that retrieves that content. This does not require compromising the agent or the agent's infrastructure — it requires compromising whatever process writes to the memory store.
In RAG-style systems, the external memory store is typically populated from documents: internal wikis, email archives, customer records, support tickets. Any process that can write to these source documents can indirectly write to the agent's memory. This is a larger attack surface than teams typically account for.
Controls for external memory: restrict write access to the ingestion pipeline to authorized sources only; validate and sanitize content before ingestion; implement source attribution so the model knows the trust level of retrieved documents; and review the full ingestion pipeline as part of the memory security assessment.
Episodic memory: the cross-session persistence surface
Episodic memory is the most dangerous memory type from a security perspective, because it combines the characteristics of external memory (persistence across sessions) with the characteristics of model-generated content (the memory was written by the model, not by a controlled ingestion pipeline).
When an agent creates an episodic memory entry — summarising a past session and storing it for retrieval — the model is writing to a persistent store. If the session being summarised contained injected content, that content may be incorporated into the summary. The injected instruction survives the session in the memory store and is retrieved in future sessions as context.
Episodic memory poisoning is the agentic equivalent of a stored cross-site scripting attack. The payload is stored in a trusted location — the agent's own memory — and executes in future sessions without requiring ongoing attacker access. The injection happens once; the effect is persistent.
The mechanism: in session one, an attacker provides a carefully crafted input that contains an instruction formatted to survive summarization. The agent processes the session and generates a summary. The summary includes the instruction in a form the model will recognize and act on. In session two — with a different user — the summary is retrieved as context, and the instruction executes in session two's context.
Controls for episodic memory: validate episodic memory entries before storage (do they look like summaries of user interactions, or do they contain instruction-like content?); implement signed or provenance-tracked memory entries that distinguish model-generated summaries from injected content; and apply the same untrusted-input treatment to retrieved episodic memories that you apply to retrieved documents.
How poisoning attacks unfold
The general pattern of a memory poisoning attack has three phases: injection, storage, and execution.
Injection phase: The attacker introduces malicious content into the system through whatever channel is available — user input, a document ingested into the knowledge base, a web page retrieved by the agent, or a forged tool return value. The content is crafted to survive processing and be stored in memory.
Storage phase: The agent processes the malicious content and, as a side effect, stores some form of it in memory. For external memory, this is direct: the malicious document is indexed into the vector store. For episodic memory, this requires the model to summarise the malicious content in a way that preserves the injected instruction.
Execution phase:A future session retrieves the stored content — because it is relevant to the new session's query — and the model acts on the injected instruction in the new session's context. The new session has no awareness that the content it retrieved was attacker-influenced.
The gap between the injection and execution phases is what makes memory poisoning attacks difficult to detect. By the time the injected instruction executes, the original session that introduced it may be long gone. Without a complete audit trail that links retrieved content to its source and tracks what was stored and when, attributing the behavior to a memory poisoning attack is very difficult.
Session isolation as a control
Session isolation is the primary structural control for memory-based attacks. It means that memory accessible in one session is not accessible to other sessions — the memory is scoped to the specific user, session, or context that created it.
Full session isolation prevents the execution phase of a memory poisoning attack from affecting other users: even if the attacker's session plants something in memory, that memory is not accessible to other sessions. It does not prevent the injection or storage phases, and it does not prevent a re-attack in the same user's future sessions.
In practice, most agentic systems implement partial session isolation: some memory is session-scoped (in-context), some is user-scoped (episodic memories associated with a specific user), and some is system-wide (shared knowledge base, shared vector store). The review must identify which memory type has which scope and assess the controls appropriate for each.
For system-wide shared memory, session isolation is not available as a control — the memory is intentionally shared. The controls here shift to: strict ingestion validation, source attribution and trust tagging, and review of the write access controls on the ingestion pipeline.
Controls for agent memory security
A complete set of memory security controls covers all three phases of a poisoning attack:
Injection prevention:
- Treat all non-system content as untrusted data, regardless of source
- Implement goal anchoring that resists displacement by injected content
- Restrict write access to external memory stores to authorized ingestion pipelines only
- Validate and sanitize content before ingestion into external memory
Storage prevention:
- Validate episodic memory entries before storage — flag entries that contain instruction-like patterns
- Implement human review for episodic memory entries that deviate from the expected summary format
- Track the provenance of all stored memory entries (source session, source document, storage timestamp)
Execution mitigation:
- Tag retrieved memory with trust level based on source; treat retrieved documents as lower-trust than system prompts
- Implement session isolation for user-specific memory
- Build an audit trail that links agent actions to the memory content that influenced them, enabling post-incident analysis
Review evidence requirements
The memory security review must produce the following evidence for the assessed system:
- Memory architecture map — which memory types are present, what is stored in each, how each is populated, and how each is accessed
- Session isolation scope — what memory is session-scoped, user-scoped, or system-wide; and the access controls for each scope
- Ingestion pipeline review — the write path for external memory stores, the access controls, and the validation steps applied before storage
- Episodic memory review — whether episodic memory is present, how entries are generated and stored, and what validation occurs before storage
- Control gap assessment — which controls from the control set above are present, which are absent, and the residual risk from absent controls
This evidence supports the memory security section of the agentic AI risk disposition. For the full review framework, see the agentic AI security review hub.
Blog
Get new posts in your inbox
AI security review, OWASP Agentic Top 10, ISO 42001 evidence, and what AI Committees actually need. No cadence promises — we publish when there's something worth reading.
Review your agent's memory architecture before deployment
Drel structures the memory security review for agentic AI systems — covering all three memory types, their poisoning paths, and the controls that bound them — as part of the design-time assessment.
A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.