PII leakage through RAG retrieval
RAG pipelines built over internal document corpora frequently contain personal data that was never intended to be queryable by the model. PII leakage through retrieval is the most common data-protection issue we encounter in RAG security reviews.
Personal data in RAG knowledge bases is usually there accidentally. No one planned to make HR records queryable through the AI assistant. No one intended the sales CRM export to end up in the document corpus. But internal document corpora — the typical source for enterprise RAG knowledge bases — contain personal data pervasively: in support tickets, in email exports, in meeting notes, in internal memos that mention employees by name and situation. When those documents are ingested without classification, the PII they contain becomes queryable through the retrieval interface.
This kind of leakage is not a sophisticated attack, and it does not require an adversary. A legitimate user asking a legitimate question may receive a response grounded in a document containing another person's personal data, and neither the user nor the system has any mechanism to signal that this has occurred.
How PII enters RAG
Personal data enters RAG knowledge bases through several routes, most of which are unintentional:
Bulk export from internal systems. The fastest way to populate a knowledge base is to export documents from existing internal repositories — SharePoint, Confluence, Google Drive, a shared network drive. These exports rarely include data classification. They include everything the repository contains, which for most organisations means a mix of public, internal, confidential, and personal data in undifferentiated form.
Support ticket and case ingestion. Customer support and internal service desk tickets are a rich source of domain knowledge — exactly the kind of content that makes a RAG system useful for support agents. They are also a dense source of personal data: customer names, contact details, account information, problem descriptions that may contain sensitive personal context.
Meeting notes and communication archives. AI-generated meeting summaries, email thread exports, and Slack/Teams archives contain personal data in the form of names, roles, and discussed topics — including sensitive HR, financial, and personal matters that were discussed in those forums.
Structured data converted to text. Some RAG pipelines ingest structured data — spreadsheets, database exports — converted to plain text. Employee records, customer lists, and financial data converted to text land in the knowledge base as queryable content.
PII leakage pathways in a RAG system
Retrieval leak
A user query retrieves chunks containing personal data from the knowledge base — because the data was ingested without classification and the retrieval scope is not restricted by the user's access rights. The personal data reaches the model's context window.
Control: Pre-ingestion data classification; access-controlled retrieval scoped to the querying user's permissions; chunk-level metadata tagging to enable filtering.
Output leak
The model includes retrieved personal data in its generated response. Even when retrieval is controlled, the generation layer may quote or paraphrase PII from context chunks that were legitimately retrieved for one purpose and surfaced in the output for another.
Control: Output validation to detect and redact PII patterns (names, IDs, contact details) before the response is returned to the user; response scoping instructions in the system prompt.
Log leak
Inference logs — prompts, retrieved chunks, and completions — are retained and accessible to parties who do not have permission to access the underlying personal data. The log becomes a secondary PII store.
Control: PII redaction before log storage; log access controls aligned to the data classification of the knowledge base; defined log retention periods with verified deletion.
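As a rough illustration of the log-leak control, the sketch below redacts a few PII patterns from an inference log record before it is stored. The pattern set, the `redact` helper, and the record fields are assumptions made for this example. Regular expressions only catch well-structured identifiers such as email addresses and phone numbers; they miss names and free-text context, and the phone pattern here will also over-redact other digit runs such as dates.

```python
import json
import re

# Illustrative patterns only; a real deployment needs a much broader detector.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each match with a placeholder naming the PII type."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def to_log_line(prompt: str, chunks: list, completion: str) -> str:
    """Build the JSON line written to the log store, redacting every field first."""
    record = {
        "prompt": redact(prompt),
        "chunks": [redact(c) for c in chunks],
        "completion": redact(completion),
    }
    return json.dumps(record)

line = to_log_line(
    "Who raised ticket 4411?",
    ["Raised by jane.doe@example.com, callback on +44 20 7946 0958."],
    "The ticket was raised by jane.doe@example.com.",
)
```

Redacting before the write, rather than at read time, means the log store never holds the raw values, which is what keeps it from becoming a secondary PII store.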
The leakage mechanism
The leakage mechanism is the retrieval itself. A user submits a query; the retrieval mechanism returns the documents most semantically similar to that query; those documents are passed to the model as context; the model synthesises a response that may incorporate personal data from those documents. No step in this chain flags the personal data. The retriever treats PII the same as any other text. The model synthesises from it without categorising it. The output validation, if present, may check for certain PII patterns but is unlikely to catch the full range of personal data present in a corpus.
The leakage takes two forms:
- Direct reproduction: the model quotes or closely paraphrases the personal data from the retrieved document, citing a person's name, medical condition, salary, or contact details in its response.
- Inferred disclosure: the model synthesises from multiple retrieved documents in a way that discloses personal information without quoting any single source — combining information from different documents to reveal associations the data subject would not expect.
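To make the mechanism concrete, here is a toy retriever that assumes a bag-of-words cosine similarity in place of a real embedding model. The corpus, the document IDs, and the `retrieve` function are invented for illustration. The point is that nothing in the ranking step knows or cares that one chunk contains health information about a named person.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercased word counts, standing in for an embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical corpus: one policy document, one HR note, one unrelated page.
CORPUS = {
    "policy-absence": "Absence management policy: report sickness absence to your line manager.",
    "hr-note-117": "Absence management note: J. Smith is on sickness absence following surgery.",
    "canteen-menu": "The canteen menu this week includes soup and salad.",
}

def retrieve(query: str, k: int = 2) -> list:
    """Return the IDs of the k documents most similar to the query."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda doc_id: cosine(q, tokens(CORPUS[doc_id])), reverse=True)
    return ranked[:k]

# A legitimate policy question pulls the HR note into the model's context.
top = retrieve("What is the absence management process for sickness?")
```

With this corpus, both the policy and the HR note land in the top two results, because the ranking depends only on word overlap with the query.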
PII leakage through RAG is not primarily a threat; it is a design consequence. Personal data ingested without classification becomes queryable personal data. The primary control sits at the data boundary, before ingestion. Output-layer checks after the fact are a backstop, not a substitute.
The data classification gap
The root cause of PII leakage in RAG is almost always a data classification gap at the ingestion stage. Documents enter the knowledge base without a classification label that identifies what personal data they contain, whose data it is, what processing basis applies, and what access restrictions govern the document.
Without classification, the knowledge base cannot enforce differential access based on data type. It cannot prevent a document containing employee medical data from being returned alongside documents containing public policy text in response to a query about absence management. It cannot identify which documents are subject to data retention obligations that may have already expired.
The classification gap is compounded by the fact that many of the documents that contain personal data are not themselves classified as “personal data documents” in the source systems. A support ticket containing a customer's name and health information may be classified as “internal” in the ticketing system. When it is ingested into the knowledge base, it retains that classification — which says nothing about its personal data content.
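One way to picture the missing label is as a small record carried alongside each document. The field names and the `is_ingestible` gate below are assumptions, not a standard schema. They mirror the four questions a classification has to answer, and they keep the source system's label in its own field because that label says nothing about personal data content.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class PersonalDataCategory(Enum):
    CONTACT = "contact"
    HEALTH = "health"            # special category data under GDPR Article 9
    EMPLOYMENT = "employment"

@dataclass(frozen=True)
class DocumentClassification:
    source_label: str                       # label inherited from the source system, e.g. "internal"
    personal_data: frozenset                # which PersonalDataCategory values the document contains
    data_subjects: frozenset                # whose data it is, e.g. {"customer"}
    processing_basis: Optional[str] = None  # lawful basis assessed for this RAG use, if any
    allowed_scopes: frozenset = frozenset() # which query scopes may retrieve the document

    def is_ingestible(self) -> bool:
        """Hypothetical gate: personal data needs an assessed basis and at least one scope."""
        if not self.personal_data:
            return True
        return self.processing_basis is not None and bool(self.allowed_scopes)

# A support ticket that the ticketing system labels "internal".
ticket = DocumentClassification(
    source_label="internal",
    personal_data=frozenset({PersonalDataCategory.CONTACT, PersonalDataCategory.HEALTH}),
    data_subjects=frozenset({"customer"}),
)
```

Here `ticket.is_ingestible()` is false even though the source label is the unremarkable "internal", because the personal data fields are evaluated separately from the inherited label.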
GDPR implications
Under GDPR, processing personal data requires a lawful basis and must respect the purpose limitation and data minimisation principles. RAG over unclassified document corpora implicates all three, and it also affects the organisation's ability to honour data subject rights.
Lawful basis. Processing personal data — including making it retrievable through an AI query interface — requires a lawful basis for each category of data subject and each processing purpose. The lawful basis for ingesting an HR record into a customer support RAG system is not the same as the lawful basis for processing it in the original HR context. Organisations typically have not assessed the lawful basis for the secondary processing RAG represents.
Purpose limitation. Personal data collected for one purpose cannot be processed for an incompatible purpose without a new lawful basis. Making personal data from support tickets queryable in a sales assistance RAG system is a purpose change that GDPR requires to be assessed.
Data minimisation. Personal data in the knowledge base should be limited to what is necessary for the RAG system's specific purpose. Bulk ingestion of internal document corpora systematically violates data minimisation unless the corpus has been scoped to exclude personal data that is not required for the system's purpose.
Data subject rights. If personal data is stored in a knowledge base, it is in scope for data subject access requests, erasure requests, and rectification requests. Organisations must be able to identify, retrieve, and delete personal data from the knowledge base in response to these requests — a capability most RAG implementations do not include.
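A minimal sketch of what that capability implies, assuming the ingestion step records which data subjects appear in each chunk. The `KnowledgeBase` class and its in-memory dictionaries are stand-ins for a vector store and its metadata index, not any particular product's API.

```python
from collections import defaultdict

class KnowledgeBase:
    """In-memory stand-in for a vector store with a data-subject index."""

    def __init__(self) -> None:
        self.chunks = {}                        # chunk_id -> text
        self.subject_index = defaultdict(set)   # subject_id -> chunk_ids

    def add(self, chunk_id: str, text: str, subject_ids: set) -> None:
        self.chunks[chunk_id] = text
        for subject_id in subject_ids:
            self.subject_index[subject_id].add(chunk_id)

    def access_request(self, subject_id: str) -> dict:
        """Return every stored chunk linked to the data subject."""
        chunk_ids = self.subject_index.get(subject_id, set())
        return {cid: self.chunks[cid] for cid in chunk_ids}

    def erase(self, subject_id: str) -> list:
        """Delete every chunk linked to the data subject and return the removed IDs."""
        removed = sorted(self.subject_index.pop(subject_id, set()))
        for cid in removed:
            self.chunks.pop(cid, None)
            # Drop the chunk from other subjects' index entries as well.
            for other_chunks in self.subject_index.values():
                other_chunks.discard(cid)
        return removed

kb = KnowledgeBase()
kb.add("ticket-1#0", "Customer 42 reported a billing error.", {"cust-42"})
kb.add("ticket-2#0", "Customers 42 and 7 share an account.", {"cust-42", "cust-7"})
kb.add("policy-1#0", "Refunds are processed within 14 days.", set())
removed = kb.erase("cust-42")
```

Deleting a whole chunk that mentions two people is a simplification; a real erasure may need to redact within the chunk instead. The same request would also have to reach stored embeddings, caches, and inference logs, which this sketch does not model.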
Controls
The controls for PII leakage in RAG operate at three layers: ingestion, retrieval, and output.
Ingestion layer. PII detection at ingestion: every document passes through automated PII detection before entering the knowledge base. Detected PII is flagged for classification — the document is either rejected from ingestion, ingested with restricted access labels, or has the PII removed or pseudonymised before ingestion. Data classification schema: a defined classification scheme that maps document types to personal data categories and access restrictions.
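The ingestion decision can be sketched as a small routing function. Everything here is an assumption for illustration: the keyword lists, the `Disposition` names, the "employee-personal" label, and the choice of a keyed hash for pseudonymisation. Keyword and regex matching is a crude stand-in for a real PII detector.

```python
import hashlib
import hmac
import re
from enum import Enum
from typing import Optional, Tuple

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Illustrative keyword lists; a real detector would use far more than keywords.
HEALTH_TERMS = {"diagnosis", "surgery", "medication"}
EMPLOYMENT_TERMS = {"salary", "disciplinary", "grievance"}

class Disposition(Enum):
    INGEST = "ingest"
    RESTRICT = "ingest-with-restricted-label"
    PSEUDONYMISE = "pseudonymise"
    REJECT = "reject"

def detect(text: str) -> Disposition:
    """Pick the strictest disposition that any detected signal calls for."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & HEALTH_TERMS:
        return Disposition.REJECT
    if words & EMPLOYMENT_TERMS:
        return Disposition.RESTRICT
    if EMAIL.search(text):
        return Disposition.PSEUDONYMISE
    return Disposition.INGEST

def pseudonymise(text: str, key: bytes) -> str:
    """Replace each email with a keyed hash so one person always maps to one token."""
    def token(match) -> str:
        digest = hmac.new(key, match.group(0).lower().encode(), hashlib.sha256).hexdigest()
        return f"<person:{digest[:12]}>"
    return EMAIL.sub(token, text)

def prepare(text: str, key: bytes) -> Tuple[Disposition, Optional[str], Optional[str]]:
    """Return (disposition, text to ingest or None, access label or None)."""
    disposition = detect(text)
    if disposition is Disposition.REJECT:
        return disposition, None, None
    if disposition is Disposition.RESTRICT:
        return disposition, text, "employee-personal"
    if disposition is Disposition.PSEUDONYMISE:
        return disposition, pseudonymise(text, key), None
    return disposition, text, None
```

Pseudonymised text is still personal data under GDPR for anyone who can reverse the mapping, so the key needs the same protection as the original identifiers.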
Retrieval layer. Access control based on data classification: documents containing personal data are only retrieved for users with a purpose-compatible access scope. Documents classified as employee-personal are only retrievable by HR system users; documents classified as customer-personal are only retrievable by authorised support users. Scope limiting: the knowledge base for each RAG deployment is scoped to the documents necessary for that deployment's purpose, not the full organisational document corpus.
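A minimal sketch of classification-based retrieval filtering, assuming each chunk already carries a classification label from ingestion. The `Chunk` type, the `ROLE_SCOPES` mapping, and the role names are hypothetical; the classification names follow the examples in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    classification: str   # e.g. "public", "employee-personal", "customer-personal"
    score: float          # similarity score assigned by the retriever

# Hypothetical mapping from user role to the classifications that role may retrieve.
ROLE_SCOPES = {
    "hr": {"public", "employee-personal"},
    "support": {"public", "customer-personal"},
    "sales": {"public"},
}

def filter_by_scope(candidates: list, role: str, k: int = 3) -> list:
    """Drop chunks outside the role's scope, then keep the top-k by score."""
    allowed = ROLE_SCOPES.get(role, set())   # unknown roles get nothing (deny by default)
    permitted = [c for c in candidates if c.classification in allowed]
    return sorted(permitted, key=lambda c: c.score, reverse=True)[:k]

candidates = [
    Chunk("hr-note-117", "Absence note for a named employee.", "employee-personal", 0.91),
    Chunk("ticket-88", "Customer complaint with contact details.", "customer-personal", 0.84),
    Chunk("policy-absence", "Absence management policy.", "public", 0.62),
]
support_view = filter_by_scope(candidates, "support")
```

The filter is applied after ranking here for readability. Where the vector store supports metadata filters, applying the same rule inside the query is usually preferable, so that out-of-scope chunks are never fetched at all.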
Output layer. PII redaction in responses: automated redaction of personal data categories from model responses where the response purpose does not require personal data. Citation transparency: responses include citations to source documents, allowing auditors to verify that the response grounding is within scope.
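Citation transparency only helps if something checks the citations. The sketch below assumes each response carries the IDs of the documents it was grounded in and that each deployment has a known set of in-scope document IDs. The `Response` type and the `audit` function are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Response:
    text: str
    citations: Tuple[str, ...]   # IDs of the source documents the answer was grounded in

def out_of_scope_citations(response: Response, deployment_scope: set) -> list:
    """Return cited document IDs that are not part of this deployment's corpus scope."""
    return [doc_id for doc_id in response.citations if doc_id not in deployment_scope]

def audit(response: Response, deployment_scope: set) -> str:
    """Classify a response as pass or fail based on its citations alone."""
    if not response.citations:
        return "fail: no citations, grounding cannot be verified"
    stray = out_of_scope_citations(response, deployment_scope)
    if stray:
        return "fail: out-of-scope sources " + ", ".join(stray)
    return "pass"

scope = {"kb-refunds", "kb-shipping"}
result = audit(Response("Refunds take 14 days.", ("kb-refunds", "hr-note-117")), scope)
```

A check like this verifies where the grounding came from, not what the response says, so it complements PII redaction of the response text without replacing it.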
DPIA requirements
A Data Protection Impact Assessment is required under GDPR Article 35 when processing is likely to result in a high risk to the rights and freedoms of natural persons. RAG over personal data almost always meets this threshold. The DPIA for a RAG system over personal data must cover:
- Description of processing. What personal data categories are in scope, whose data is processed, what the RAG system does with it, and who can query the system.
- Necessity and proportionality. Why personal data is necessary for the RAG system's purpose, and what data minimisation measures are applied.
- Risk assessment. The specific risks of RAG processing of personal data — including leakage through retrieval, purpose limitation violations, and inability to fulfil data subject rights — with likelihood and severity assessment.
- Mitigation measures. The controls applied at ingestion, retrieval, and output layers, with evidence that they are implemented and effective.
- Residual risk acceptance. The residual risk after controls, accepted by the data controller with named accountability.
The DPIA must be completed before the RAG system goes into operation with personal data in scope — not retroactively.
Review checklist
A RAG PII leakage review covers:
- Data classification schema: what classification labels are in use and what personal data categories they cover.
- Ingestion PII detection: what automated detection runs at ingestion, what it catches, and what happens to flagged documents.
- Scope of knowledge base per deployment: is the corpus scoped to necessary documents or is it a bulk export?
- Retrieval access control for personal data: how personal-data-bearing documents are restricted to purpose-compatible query scope.
- Data subject rights capability: can personal data be identified, retrieved, and deleted from the knowledge base in response to a subject access or erasure request?
- DPIA completion status: is the DPIA complete, current, and approved by the data controller, with the DPO's advice recorded?
- Output PII validation: what checks run on model responses to prevent personal data from appearing in out-of-scope responses?
See the Drel RAG security assessment hub for the full PII leakage review module with DPIA template.
A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.