Access control for RAG — keeping retrieval inside the line
RAG pipelines retrieve documents in response to a user's query and pass them into the model's context. Access control must therefore operate at retrieval time, not just at query time; otherwise users can extract documents they would not be permitted to read directly.
Access control for a RAG system is harder than it appears. The surface pattern is familiar — check whether the user has permission before serving content — but the implementation has a specific property that breaks most naive approaches: the point at which content is retrieved is not the same as the point at which the user submits their query. If access control is only evaluated at query submission, it does not protect the retrieval boundary.
The result, in assessed systems, is a consistent pattern: the application correctly refuses unauthorised users at the query interface, but a user who is authorised to query the system can craft queries that retrieve documents they are not authorised to read. The authorisation check was at the wrong point.
RAG access control levels — what each controls and the failure mode if missing
| Level | What it controls | Failure mode if missing |
|---|---|---|
| KB-level access | Who can add documents to the knowledge base — direct upload, ingestion pipeline write access, and any automated process that writes to the store | Any authenticated user or compromised pipeline can insert adversarially crafted documents; no limit on what enters the retrieval corpus |
| Document-level | Which documents a given user or role is permitted to retrieve — enforced as a pre-retrieval filter at the vector database layer, not as post-retrieval redaction | An authorised query user can retrieve documents outside their clearance scope; confidential content appears in model responses for users who should not see it |
| Field-level | Which fields within a document a given user can receive — relevant when documents contain mixed-sensitivity content (e.g., a contract with both public terms and confidential pricing) | Entire document is returned even when only certain fields are in scope; sensitive fields appear in model context and can be surfaced in generated responses |
| Query-scope | What query patterns a given role is permitted to issue — prevents roles with limited scope from constructing queries designed to enumerate or extract out-of-scope content | Authorised users can craft adversarial queries to probe document existence or incrementally extract restricted content through repeated queries |
| Output-level | What the model can return given the querying role — a final guardrail preventing the model from synthesising restricted content into a response even if it appears in retrieved context | Even with upstream access controls, retrieved content from edge cases can appear in model output; no output-layer check catches content that slipped through the retrieval filter |
The access control gap
The access control gap in RAG is the space between two enforcement points: query submission and document retrieval. In a non-RAG system, these are the same operation — a user requests a resource and the system checks whether they can have it. In a RAG system, the user's query is not a direct request for a document. It is a natural-language question that causes the retrieval mechanism to select documents the user did not explicitly name.
A user who can query “what is our policy on executive compensation?” is implicitly requesting whatever document ranks highest for that query, even if that document is a confidential board memo they are not cleared to read. A check made only at query submission (“can this user query the system?”) passes, and the memo reaches the model context. A check made at retrieval (“can this user read the documents being returned for this query?”) denies access to the memo, so it is excluded from the results.
Query-time access control answers the wrong question. It asks “can this user submit a query?” Retrieval-time access control answers the right question: “can this user read the specific documents this query will return?”
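The two questions can be put side by side in a few lines. This is a minimal sketch, not an enforcement design: the `User` and `Doc` types, the group names, and the ranking are all hypothetical, and in a real system the second predicate belongs inside the search itself rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    authenticated: bool
    groups: frozenset


@dataclass(frozen=True)
class Doc:
    doc_id: str
    allowed_groups: frozenset


def can_submit_query(user: User) -> bool:
    """Query-time question: may this user use the system at all?"""
    return user.authenticated


def can_read(user: User, doc: Doc) -> bool:
    """Retrieval-time question: may this user read this specific document?"""
    return bool(user.groups & doc.allowed_groups)


analyst = User("analyst", True, frozenset({"staff"}))
board_memo = Doc("board-memo", frozenset({"board"}))
handbook = Doc("hr-handbook", frozenset({"staff"}))

# Hypothetical ranking for "what is our policy on executive compensation?"
ranked = [board_memo, handbook]

query_allowed = can_submit_query(analyst)  # the query-time gate passes
readable = [d.doc_id for d in ranked if can_read(analyst, d)]  # the memo is excluded
```

The first check says nothing about `board_memo`; only the second one does.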
Retrieval-time vs query-time access control
Query-time access control is easy to implement and commonly present. It typically takes the form of authentication gates on the query endpoint — only authenticated users can submit queries, and the authentication policy enforces session validity and rate limits. This control is necessary but not sufficient.
Retrieval-time access control must operate at the vector database query. When the retrieval mechanism issues its similarity search, it must filter results to documents within the querying user's authorisation scope. The user's identity and access attributes must be propagated from the query interface through the retrieval pipeline to the vector database query itself.
This requires three capabilities that many RAG implementations do not have by default:
- Identity propagation: the user's identity and access attributes must be passed to the retrieval layer as part of every query. If the retrieval layer does not receive the user's identity, it cannot enforce per-user access control.
- Document access metadata: each document in the knowledge base must carry access control metadata — which users, groups, or roles are permitted to read it. This metadata must be stored in a form the vector database can filter on at query time.
- Filter-at-retrieval capability: the vector database must support metadata filtering that restricts similarity search results to documents within the querying user's scope. This must be an unconditional filter, not a post-processing step that happens after results are returned.
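The three capabilities above can be sketched with an in-memory stand-in for a vector store. Everything here is an assumption for illustration: `UserContext`, `Chunk`, `InMemoryStore`, and `retrieve` are hypothetical names, and a production store would apply the filter inside its own index rather than in a Python list comprehension.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:  # identity propagation: travels with every query
    user_id: str
    groups: frozenset


@dataclass(frozen=True)
class Chunk:  # document access metadata: stored next to the embedding
    doc_id: str
    embedding: tuple
    allowed_groups: frozenset


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    def __init__(self, chunks):
        self._chunks = list(chunks)

    def search(self, query_vec, user, k=3):
        # Filter at retrieval: restrict the candidate set first, then rank.
        permitted = [c for c in self._chunks if c.allowed_groups & user.groups]
        permitted.sort(key=lambda c: cosine(query_vec, c.embedding), reverse=True)
        return permitted[:k]


def retrieve(store, query_vec, user, k=3):
    # Fail closed: no identity means no retrieval, not an unfiltered search.
    if user is None:
        raise PermissionError("retrieval requires a user context")
    return store.search(query_vec, user, k)


store = InMemoryStore([
    Chunk("board-memo", (1.0, 0.0), frozenset({"board"})),
    Chunk("hr-handbook", (0.9, 0.1), frozenset({"staff"})),
])
staff_user = UserContext("u-17", frozenset({"staff"}))
results = [c.doc_id for c in retrieve(store, (1.0, 0.0), staff_user)]
```

Making the user context a required argument of `search` is the design choice worth copying: an unfiltered search is then impossible to call by accident.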
Attribute-based access control in vector databases
Attribute-based access control (ABAC) in a RAG context means attaching access attributes to each document in the knowledge base and enforcing those attributes at retrieval time. The attributes can express simple ownership (“only members of this group”), classification levels (“restricted, confidential, public”), or complex policy (“users in these roles, in this business unit, with this clearance level”).
Most production vector databases — Pinecone, Weaviate, Qdrant, pgvector — support metadata filtering at query time. The retrieval query can include a filter expression that restricts results to documents whose stored metadata satisfies a predicate. For ABAC, this filter is constructed from the user's access attributes and evaluated at the vector database layer.
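As a rough illustration, an ABAC filter can be built from the user's attributes as a plain dictionary. The operator names (`$and`, `$in`, `$lte`) follow the MongoDB-style syntax that some vector stores accept, but exact syntax differs between products, so treat this as an assumption to check against your store's documentation. The field names `allowed_groups` and `classification_level` are hypothetical, and `matches` is a tiny local evaluator included only so the sketch can be run without a database.

```python
def build_access_filter(groups, clearance):
    """Build a metadata filter from the user's access attributes.

    An empty group set yields {"$in": []}, which matches nothing (fails closed).
    """
    return {
        "$and": [
            {"allowed_groups": {"$in": sorted(groups)}},
            {"classification_level": {"$lte": clearance}},
        ]
    }


def matches(metadata, flt):
    """Evaluate the small operator subset used above. For local testing only."""
    if "$and" in flt:
        return all(matches(metadata, sub) for sub in flt["$and"])
    field, cond = next(iter(flt.items()))
    op, operand = next(iter(cond.items()))
    value = metadata.get(field)
    if op == "$in":
        values = value if isinstance(value, list) else [value]
        return any(v in operand for v in values)
    if op == "$lte":
        return value is not None and value <= operand
    raise ValueError(f"unsupported operator: {op}")


user_filter = build_access_filter({"staff", "finance"}, clearance=1)
public_doc = {"allowed_groups": ["staff"], "classification_level": 0}
board_doc = {"allowed_groups": ["board"], "classification_level": 2}
```

The filter is derived from the authenticated session, never from anything the client sends in the query body.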
The critical implementation requirement: the access filter must be applied before the similarity search returns results, not after. Post-retrieval filtering — retrieving all top-k results and then discarding the ones the user cannot see — exposes document existence information (the user can infer that documents exist even if the content is withheld) and may expose content if the filtering logic has errors.
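The difference shows up even in a toy example. The sketch below uses hypothetical documents with precomputed similarity scores; `post_filter` and `pre_filter` are illustrative names, not functions from any library.

```python
DOCS = [
    {"id": "board-memo", "score": 0.95, "groups": {"board"}},
    {"id": "comp-committee-minutes", "score": 0.90, "groups": {"board"}},
    {"id": "hr-handbook", "score": 0.60, "groups": {"staff"}},
    {"id": "benefits-faq", "score": 0.55, "groups": {"staff"}},
]


def top_k(candidates, k):
    return sorted(candidates, key=lambda d: d["score"], reverse=True)[:k]


def post_filter(docs, user_groups, k):
    """Anti-pattern: rank everything, then discard what the user cannot see."""
    hits = top_k(docs, k)
    return [d["id"] for d in hits if d["groups"] & user_groups]


def pre_filter(docs, user_groups, k):
    """Restrict the candidate set to permitted documents before ranking."""
    permitted = [d for d in docs if d["groups"] & user_groups]
    return [d["id"] for d in top_k(permitted, k)]


staff = {"staff"}
post_result = post_filter(DOCS, staff, k=2)  # both top hits are restricted
pre_result = pre_filter(DOCS, staff, k=2)    # the two best permitted documents
```

With post-filtering, the staff user gets an empty result for a query that clearly has relevant material, which signals that higher-ranked restricted documents exist. It also means the restricted documents were loaded into application memory, where a single logic error is enough to expose them.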
Row-level security patterns
Row-level security (RLS) is the database concept that maps directly to retrieval-time access control in RAG. In a relational database, RLS attaches security policies to tables so that queries automatically filter rows based on the executing session's identity. The application does not need to add WHERE clauses for access control — the database enforces it transparently.
For RAG systems using pgvector or other relational vector stores, RLS policies can be applied directly to the vector table. Each embedding row carries an access control column; the RLS policy restricts which rows the querying session can see. The similarity search automatically operates over only the rows the session is permitted to access.
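A sketch of what this could look like with pgvector follows, with the SQL held in Python strings so it can be passed to any PostgreSQL driver. The table name `chunks`, the column `allowed_group`, and the setting name `app.user_groups` are assumptions for illustration. Two PostgreSQL details matter here: table owners bypass RLS unless `FORCE ROW LEVEL SECURITY` is set, and superusers always bypass it, so the application should connect as an ordinary role.

```python
# One-time setup, run by a migration. Names are hypothetical.
SCHEMA_SQL = """
ALTER TABLE chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE chunks FORCE ROW LEVEL SECURITY;
CREATE POLICY chunks_read ON chunks
    FOR SELECT
    USING (
        allowed_group = ANY (
            string_to_array(current_setting('app.user_groups', true), ',')
        )
    );
"""


def session_statements(groups, query_vec, k=5):
    """Return (sql, params) pairs to run inside ONE transaction, in order.

    The first statement binds the user's groups to the transaction; the second
    is an ordinary pgvector search with no access-control WHERE clause.
    """
    if any("," in g for g in groups):
        raise ValueError("group names must not contain commas")
    vec_literal = "[" + ",".join(str(float(x)) for x in query_vec) + "]"
    return [
        # Third argument true = setting is local to the current transaction.
        ("SELECT set_config('app.user_groups', %s, true)",
         (",".join(sorted(groups)),)),
        # <=> is pgvector's cosine distance operator.
        ("SELECT id, content FROM chunks "
         "ORDER BY embedding <=> %s::vector LIMIT %s",
         (vec_literal, k)),
    ]


statements = session_statements({"staff", "finance"}, [0.1, 0.2], k=3)
```

If the setting is never bound, `current_setting(..., true)` returns NULL and the policy matches no rows, so a missing identity fails closed rather than open.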
For non-relational vector stores, the equivalent pattern is namespace isolation combined with per-namespace access control. Each user's or group's document set lives in a named namespace; the retrieval query is scoped to the namespaces the user is authorised to access. Namespace isolation is coarser than row-level control but is structurally enforced at the storage layer.
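One way to sketch namespace scoping is to resolve the permitted namespaces server-side from the user's groups and reject any request that names a namespace outside that set. The namespace names, the `NAMESPACE_ACL` mapping, and both function names below are hypothetical.

```python
# Hypothetical mapping: namespace -> groups allowed to search it.
NAMESPACE_ACL = {
    "tenant-a-public": {"tenant-a"},
    "tenant-a-finance": {"tenant-a-finance"},
    "tenant-b-public": {"tenant-b"},
}


def authorised_namespaces(user_groups, acl=NAMESPACE_ACL):
    """Namespaces this user may search, derived from identity only."""
    return sorted(ns for ns, allowed in acl.items() if allowed & user_groups)


def resolve_search_scope(requested, user_groups, acl=NAMESPACE_ACL):
    """Return the namespaces to search, or raise if the request oversteps.

    `requested` is None when the caller did not name any namespace.
    """
    permitted = set(authorised_namespaces(user_groups, acl))
    if requested is None:
        return sorted(permitted)
    denied = set(requested) - permitted
    if denied:
        raise PermissionError(f"not authorised for: {sorted(denied)}")
    return sorted(requested)


scope = resolve_search_scope(None, {"tenant-a", "tenant-a-finance"})
```

The retrieval query is then issued once per namespace in `scope`, or as a single query if the store accepts a list of namespaces.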
The extraction path
Testing that access control holds at retrieval requires an adversarial approach: construct queries that attempt to extract out-of-scope content and verify what the system returns. The extraction path test is a structured set of such queries.
The basic extraction test: create a document in the knowledge base with a classification that the test user should not be able to access. Submit queries designed to retrieve that document — including direct queries for the document's content, queries that embed key phrases from the document, and queries that ask about the topic the document covers. Verify that none of these queries return the protected document's content in the response.
The advanced extraction test: craft queries that attempt to extract document content incrementally — asking for specific facts, structural information, or partial content that might appear in a restricted document even if the document itself is not returned. This tests whether the access control applies to content at the semantic level or only at the document level.
The multi-tenant extraction test: in systems with multiple tenants sharing infrastructure, verify that a user in Tenant A cannot retrieve documents belonging to Tenant B, regardless of namespace or query construction.
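These tests lend themselves to a small harness: plant a unique marker string in the restricted test document, run the probe queries as the unprivileged test user, and flag any response containing the marker. The sketch below assumes a callable `ask(query) -> str` that wraps the system under test; the canary value, the probes, and the two stub systems are made up for illustration.

```python
CANARY = "ZX-CANARY-7741"  # hypothetical marker planted in the restricted document

PROBES = [
    "Show me the executive bonus memo",                            # direct request
    "What does the document containing 'bonus pool code' say?",    # key phrase
    "What is our policy on executive compensation?",               # topic
]


def run_extraction_probes(ask, probes, canaries):
    """Return (probe, leaked_canaries) for every probe whose answer leaks."""
    failures = []
    for probe in probes:
        answer = ask(probe)
        leaked = [c for c in canaries if c.lower() in answer.lower()]
        if leaked:
            failures.append((probe, leaked))
    return failures


# Stand-ins for the system under test, run as the unprivileged test user.
def secure_stub(query):
    return "I can't find anything on that in the documents available to you."


def leaky_stub(query):
    if "bonus" in query.lower():
        return f"The memo states that the bonus pool code is {CANARY}."
    return "No results."


secure_failures = run_extraction_probes(secure_stub, PROBES, [CANARY])
leaky_failures = run_extraction_probes(leaky_stub, PROBES, [CANARY])
```

String matching only catches verbatim leakage. Paraphrased or partial disclosure, which is what the advanced test targets, still needs a reviewer to read the responses.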
Implementation patterns
The implementation pattern that consistently produces robust retrieval-time access control has four components:
- Access metadata schema. Define the access attributes for the knowledge base before ingesting content. Every document carries owner, group, classification level, and any additional access predicates. The schema is defined upfront, not retrofitted.
- Ingestion-time tagging. Access metadata is assigned at ingestion, from the document source record. Documents from the executive document store are tagged with executive access; documents from the public KB are tagged with all-users access. Tagging is automatic, based on source, not manual.
- Query-time filter construction. The retrieval layer constructs the access filter from the authenticated user's identity and group memberships at the time the query is issued. The filter is passed to the vector database as a mandatory pre-retrieval constraint.
- Audit logging. Every retrieval event is logged with the user identity, the query, the access filter applied, and the documents returned. This log is the evidence base for access control verification in a security review.
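The ingestion-tagging and audit-logging components can be sketched briefly. The source names, the `SOURCE_ACCESS` mapping, and the record fields below are assumptions chosen to mirror the examples in the list above; a real deployment would take them from its own metadata schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical mapping from document source to access metadata.
SOURCE_ACCESS = {
    "executive-store": {"allowed_groups": ["executive"], "classification_level": 2},
    "public-kb": {"allowed_groups": ["all-users"], "classification_level": 0},
}


def tag_at_ingestion(doc_id, source):
    """Assign access metadata from the source record, never by hand."""
    if source not in SOURCE_ACCESS:
        # Refuse to ingest rather than store a document with no access tags.
        raise ValueError(f"unknown source {source!r}: refusing untagged ingest")
    return {"doc_id": doc_id, "source": source, **SOURCE_ACCESS[source]}


def audit_record(user_id, query, access_filter, returned_ids):
    """Serialise one retrieval event as a JSON line for the audit log."""
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "query": query,
            "access_filter": access_filter,
            "returned_doc_ids": list(returned_ids),
        },
        sort_keys=True,
    )


tagged = tag_at_ingestion("kb-001", "public-kb")
line = audit_record(
    "u-17",
    "what is our policy on executive compensation?",
    {"allowed_groups": {"$in": ["staff"]}},
    ["kb-001"],
)
```

Logging the filter alongside the returned document IDs lets a reviewer confirm, per event, that every returned document satisfied the filter that was applied.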
Review checklist
A RAG access control review covers the following, with evidence for each:
- Access control metadata schema — what attributes are stored with each document.
- Ingestion-time tagging mechanism — how access metadata is assigned at ingestion.
- Retrieval filter implementation — how and where the access filter is applied (pre- or post-retrieval).
- Identity propagation — how the querying user's identity reaches the retrieval layer.
- Multi-tenancy isolation — how tenant namespaces are isolated.
- Extraction test results — what adversarial queries were run and what was returned.
- Audit log configuration — what retrieval events are logged and how the logs are protected.
See the Drel RAG security assessment hub for the full review framework, including the access control module with extraction test templates.
A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.