Vector database security for RAG pipelines
Vector databases are infrastructure. They inherit all the access-control and injection requirements of any other data store — plus some RAG-specific ones. This piece maps the security requirements for a vector database in a production RAG pipeline.
A vector database is infrastructure. It stores data, exposes an API, has access controls, and requires the same security review as any other data store. This is easy to forget because vector databases are discussed primarily in AI and ML contexts — as embedding stores, retrieval backends, semantic search engines. But from a security standpoint, what matters is that a vector database holds content that gets delivered to an LLM, and that delivery path has specific security requirements that go beyond those of a standard database.
When infrastructure teams review a vector database as part of a RAG security assessment, they often apply only their standard database review checklist — encryption at rest, access control, backup. That covers the infrastructure baseline but misses the RAG-specific requirements: the injection surface, the retrieval-time access control requirement, the multi-tenancy isolation model, and the relationship between what is stored and what gets put in an LLM prompt.
Vector DB as infrastructure
A vector database in a RAG pipeline serves two roles simultaneously. It is a data store — it stores document text and metadata alongside embeddings — and it is a retrieval engine — it selects which stored content is relevant to a given query and returns it. The dual role matters for security because the threats to each role are different.
As a data store, the vector database inherits all infrastructure security requirements: access control for read, write, and delete operations; encryption at rest and in transit; audit logging of operations; backup and recovery capability; and API security for the management and query interfaces. These are standard and apply regardless of the AI context.
As a retrieval engine, the vector database has additional requirements that are specific to its role in the RAG pipeline. The content it returns is delivered to an LLM as trusted context. Any content that passes through the retrieval engine can influence model outputs. The access control at retrieval time is a direct input to the security posture of the RAG system overall.
A vector database that is secure as infrastructure but misconfigured for its RAG role can still enable data leakage, access control bypass, and injection. The infrastructure review and the RAG-specific review are both necessary and neither substitutes for the other.
Vector database attack surfaces
| Surface | How to test | Mitigation |
|---|---|---|
| Poisoning | Ingest adversarial documents designed to shift retrieval results for target queries. Verify whether the poisoned documents are retrieved preferentially over legitimate content after ingestion. | Write-access controls on the ingestion path. Document provenance tracking. Anomaly detection on newly ingested content. Periodic knowledge-base integrity audits. |
| Extraction | Submit systematic queries designed to reconstruct the knowledge base contents — iterating over likely content patterns to retrieve chunks from across the corpus. Measure how much of the corpus can be reconstructed. | Rate limiting on queries per user. Retrieval result diversity caps. Response output validation to detect systematic extraction patterns. Access logging with anomaly alerting. |
| Embedding inversion | Attempt to reconstruct original text from stored embedding vectors using known inversion techniques. Particularly relevant when embeddings are exposed via API. | Do not expose raw embedding vectors through customer-facing APIs. Use dimensionality reduction or perturbation to degrade invertibility. Treat stored embeddings as sensitive data at rest. |
| Access control bypass | Attempt to query the vector database directly (bypassing the RAG application layer) using the database's native API with compromised or misconfigured credentials. Attempt cross-tenant queries in multi-tenant deployments. | Network-level access controls preventing direct database access from outside the application tier. Tenant isolation enforced at the database query layer. Credential rotation and audit logging for direct database access. |
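The "measure how much of the corpus can be reconstructed" test and the "access logging with anomaly alerting" mitigation for extraction can share one simple metric: the fraction of distinct chunks a single user has retrieved. The sketch below is a minimal illustration, not a production detector. `CoverageMonitor`, `alert_fraction`, and the chunk identifiers are hypothetical names introduced here.

```python
from collections import defaultdict


class CoverageMonitor:
    """Tracks which distinct chunks each user has retrieved (hypothetical helper)."""

    def __init__(self, corpus_size: int, alert_fraction: float = 0.2):
        if corpus_size <= 0:
            raise ValueError("corpus_size must be positive")
        self.corpus_size = corpus_size
        self.alert_fraction = alert_fraction  # assumed threshold; tune per deployment
        self._seen = defaultdict(set)         # user_id -> set of chunk ids

    def coverage(self, user_id: str) -> float:
        """Fraction of the corpus this user has retrieved so far."""
        return len(self._seen.get(user_id, set())) / self.corpus_size

    def record(self, user_id: str, chunk_ids: list) -> bool:
        """Log one retrieval; return True when coverage reaches the alert threshold."""
        self._seen[user_id].update(chunk_ids)
        return self.coverage(user_id) >= self.alert_fraction
```

In a red-team run the same counter gives the reconstruction figure directly; in production, a `True` return would feed whatever alerting channel the access logs already use.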
Access control requirements
Access control in a vector database has three planes: the write plane (who can insert, update, or delete embeddings and their associated metadata), the read plane (who can query and retrieve embeddings), and the management plane (who can administer collections, namespaces, and configuration).
Write plane. The write plane controls who can cause content to appear in the vector store. This maps directly to the data boundary in the RAG threat model: write access is ingestion access, and ingestion access is the ability to introduce content that will later be retrieved into LLM prompts. Write access should be restricted to authorised ingestion services and withheld from application users and query services.
Read plane. The read plane controls who can query the vector store. In a RAG pipeline, the query service (the retrieval layer) is the primary read-plane client. But the read plane must also enforce per-document access control at query time: the query service can retrieve content on behalf of a user, but only content within that user's access scope. This is the retrieval-time access control requirement that many RAG implementations miss.
Management plane. The management plane controls collection creation, deletion, index configuration, and administrative operations. Management plane access should be restricted to infrastructure administrators and not accessible through the application layer. Cloud-hosted vector databases with API key-based management access are particularly at risk if API keys are exposed in application code.
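The retrieval-time check described for the read plane can be sketched as a filter applied to candidate chunks before they reach prompt assembly. This is a minimal illustration under stated assumptions: `Chunk`, `allowed_groups`, and `filter_by_scope` are hypothetical names, and the sketch assumes each chunk's access list was captured as metadata at ingestion time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    allowed_groups: frozenset  # ACL captured at ingestion (assumed metadata field)


def filter_by_scope(candidates: list, user_groups: set) -> list:
    """Keep only chunks whose ACL intersects the caller's groups.

    Chunks with an empty ACL are dropped: the default is deny.
    """
    return [c for c in candidates if c.allowed_groups & user_groups]
```

Where the database supports filtering inside the query itself, that is usually preferable, because filtering after retrieval can leave fewer results than requested. The deny-by-default rule applies either way.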
Injection surface
Vector databases have an injection surface that standard databases do not: the content of the embeddings and their associated metadata is delivered to an LLM. An adversary who can write to the vector store — directly or through an ingestion pipeline compromise — can inject content into the LLM's context by writing embeddings for adversarially crafted documents.
The injection surface includes both the document text stored alongside embeddings (which the retrieval layer returns to the application) and the metadata associated with each embedding. If metadata fields are returned to the application and included in prompt assembly, those fields are also injection vectors.
The review question for the injection surface is: what content can be written to the vector store, by whom, and what validation exists at write time? A vector store that accepts arbitrary text and metadata without validation is an injection surface with no gate. The gate must be at the ingestion pipeline, not at the vector store itself — but the review must verify that the gate exists and that there is no write path that bypasses it.
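A write-time gate of the kind the review looks for might resemble the sketch below. Everything in it is an assumption made for illustration: the metadata schema, the length limit, and the patterns are hypothetical, and pattern matching alone will not catch all injection text. The point is that both document text and metadata pass through the same check before anything is written.

```python
import re

ALLOWED_METADATA_KEYS = {"source", "title", "tenant_id"}  # hypothetical schema
MAX_TEXT_CHARS = 8000                                      # hypothetical limit
# Illustrative patterns only; real injection text will not always match a regex.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt", re.IGNORECASE
)


def validate_for_ingestion(text: str, metadata: dict) -> list:
    """Return rejection reasons; an empty list means the record may be written."""
    problems = []
    if not text.strip():
        problems.append("empty text")
    if len(text) > MAX_TEXT_CHARS:
        problems.append("text too long")
    unknown = set(metadata) - ALLOWED_METADATA_KEYS
    if unknown:
        problems.append(f"unexpected metadata keys: {sorted(unknown)}")
    # Metadata values are checked like the text, since they may reach the prompt.
    for field, value in [("text", text), *metadata.items()]:
        if not isinstance(value, str):
            problems.append(f"non-string value in {field}")
        elif SUSPICIOUS.search(value):
            problems.append(f"suspicious content in {field}")
    return problems
```

The ingestion pipeline would call this once per record and refuse the write when the returned list is non-empty.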
Data at rest
Content stored in a vector database is data at rest and must meet the same encryption requirements as any other data store containing the same classification of information. For knowledge bases that contain personal data, confidential documents, or regulated information, encryption at rest is a baseline requirement.
The specific requirements depend on the sensitivity of the stored content:
- Encryption at rest: all stored data (embeddings and document text) encrypted using AES-256 or equivalent. For managed cloud services, verify that the service encrypts at rest and that encryption is enabled in the configuration.
- Key management: encryption keys managed through a dedicated key management service, not hardcoded or stored alongside the data. Customer-managed keys (CMK) required for regulated data categories.
- Encryption in transit: all connections to the vector database use TLS 1.2 or later. No plaintext connections permitted.
- Data residency: where GDPR transfer restrictions or other residency obligations apply, confirm that the vector database stores data within the required geographic boundaries and that replication does not cross those boundaries.
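For the encryption-in-transit item, a client can refuse anything below TLS 1.2 at the socket layer. The sketch below uses only Python's standard `ssl` module. The function name `make_client_tls_context` is hypothetical, and how the context is passed to a given vector database client depends on that client.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Client-side TLS context that verifies certificates and rejects TLS < 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```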
API security
Vector databases expose APIs for ingestion, querying, and management. Each API surface has distinct security requirements.
Authentication. Every API call to the vector database must be authenticated. For managed cloud services, API key authentication is standard; keys must be rotated on a defined schedule and revoked immediately when a service is decommissioned or a team member leaves. For self-hosted deployments, authentication should use short-lived credentials or service accounts, not long-lived static credentials.
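A rotation schedule is easier to enforce when key age is checked mechanically. The following is a minimal sketch assuming key creation timestamps are available from an inventory. `keys_due_for_rotation` and the key names in the example are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def keys_due_for_rotation(
    created_at: dict, max_age: timedelta, now: Optional[datetime] = None
) -> list:
    """Return the names of keys older than max_age, sorted for stable reporting.

    created_at maps key name -> timezone-aware creation datetime.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, created in created_at.items() if now - created > max_age)
```

For example, with a 90-day policy, a key created on 1 January would be reported on 1 June, while a key created on 20 May would not.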
Authorisation. API keys or service accounts should have the minimum scope required for their role. The ingestion service needs write access; the retrieval service needs read access; neither needs management access. Privilege separation at the API level is the simplest blast-radius control for a compromised credential.
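The privilege separation described above can be expressed as a small scope map and audited against what is actually granted. This is an illustrative sketch: the role names and the `Op` enum are hypothetical, and real services express scopes in their own terms.

```python
from enum import Enum


class Op(Enum):
    READ = "read"
    WRITE = "write"
    MANAGE = "manage"


# Expected minimum scope per role (hypothetical role names).
ROLE_SCOPES = {
    "ingestion-service": {Op.WRITE},
    "retrieval-service": {Op.READ},
    "infra-admin": {Op.MANAGE},
}


def is_allowed(role: str, op: Op) -> bool:
    """Deny by default: unknown roles have no scopes."""
    return op in ROLE_SCOPES.get(role, set())


def over_privileged(granted: dict) -> dict:
    """Report scopes granted beyond the expected minimum for each role."""
    report = {}
    for role, scopes in granted.items():
        extra = scopes - ROLE_SCOPES.get(role, set())
        if extra:
            report[role] = extra
    return report
```

Running `over_privileged` against the scopes configured on each API key gives the reviewer a direct list of credentials whose compromise would reach further than their role requires.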
Rate limiting. The query API should have rate limits that prevent cost-exhaustion attacks and bulk extraction attacks. An adversary who can submit unlimited queries can attempt to extract the full corpus through successive queries. Rate limits reduce the feasibility of this approach.
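One common way to implement a per-user query limit is a sliding window over recent request times. The sketch below is in-memory and single-process, which is an assumption made for brevity: a deployed limiter would normally sit in a gateway or shared store. `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allows at most max_queries per user within any window_seconds span."""

    def __init__(self, max_queries: int, window_seconds: float):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # user_id -> timestamps of allowed queries

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        events = self._events[user_id]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_queries:
            return False
        events.append(now)
        return True
```

The optional `now` argument exists so the limiter can be tested deterministically; callers in the query path would omit it.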
Network controls. The vector database API should not be publicly accessible unless the deployment requires it. For internal deployments, access should be restricted to the application subnet. For managed cloud services, VPC peering or private endpoint configurations should be used to avoid public routing.
Multi-tenancy
Multi-tenant RAG systems — where multiple organisations, departments, or user groups share a vector database — have specific isolation requirements. Namespace isolation, collection-level access control, and metadata filtering are the three mechanisms used to enforce tenant boundaries.
Namespace isolation is the coarsest mechanism and the one enforced most directly by the storage structure. Each tenant's content is stored in a dedicated namespace or collection; queries are scoped to the tenant's namespace and cannot return content from other namespaces. Namespace isolation is enforced at the storage layer and does not depend on application-layer access control logic.
Metadata filtering provides finer-grained control within a shared namespace. Each document carries a tenant identifier in its metadata; queries include a mandatory filter on that identifier. This is application-layer enforcement — the correctness of the filter depends on the application code — and is therefore more vulnerable to implementation bugs than namespace isolation.
The security review for multi-tenant deployments must verify that cross-tenant retrieval is not possible through adversarial query construction. The test is: submit queries from Tenant A that are designed to retrieve Tenant B content, and verify that the system does not return it.
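The cross-tenant test can be automated as a probe loop that fails if any returned document belongs to another tenant. The sketch below uses an in-memory stand-in for the vector store, with substring matching in place of similarity search. `Doc`, `TenantScopedStore`, and `cross_tenant_leaks` are hypothetical names, and the stand-in assumes the tenant identifier comes from the authenticated session rather than from caller-supplied filters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Doc:
    doc_id: str
    tenant_id: str
    text: str


class TenantScopedStore:
    """In-memory stand-in for a vector store; matches by substring, not similarity."""

    def __init__(self, docs):
        self._docs = list(docs)

    def query(self, tenant_id: str, text: str, extra_filter: Optional[dict] = None) -> list:
        extra_filter = extra_filter or {}
        # The tenant filter is mandatory and cannot be overridden by the caller.
        if "tenant_id" in extra_filter:
            raise ValueError("tenant_id is set from the session, not by the caller")
        return [
            d for d in self._docs
            if d.tenant_id == tenant_id
            and text.lower() in d.text.lower()
            and all(getattr(d, k, None) == v for k, v in extra_filter.items())
        ]


def cross_tenant_leaks(store: TenantScopedStore, attacker_tenant: str, probes: list) -> list:
    """Run probe queries as attacker_tenant; return any documents from other tenants."""
    return [
        doc
        for probe in probes
        for doc in store.query(attacker_tenant, probe)
        if doc.tenant_id != attacker_tenant
    ]
```

Against a real deployment the probe list would contain terms known to appear only in Tenant B's content, and a non-empty result is a finding.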
Review requirements
An AI security review of the vector database as a component of a RAG system must cover both the infrastructure baseline and the RAG-specific requirements.
- Infrastructure baseline: authentication and authorisation configuration, encryption at rest and in transit, API rate limiting, network access controls, backup and recovery capability, audit logging.
- Access control planes: write plane (ingestion access), read plane (retrieval access with per-document scope enforcement), management plane (administrative access restriction).
- Injection surface: write path audit (all paths that can cause content to appear in the store), content validation at each write path.
- Multi-tenancy isolation: namespace or collection isolation configuration, cross-tenant retrieval test results.
- Credential management: API key rotation policy, key scope configuration, key storage controls.
See the Drel RAG security assessment hub for the vector database review module and its configuration checklist.
A note on scope: Drel reviews assessed systems against documented architecture, configuration and intent. It does not ingest live telemetry from production environments. Dispositions reflect the assessed system at the time of review and the re-assessment triggers that govern when the disposition must be revisited.