RAG and Secure Retrieval

Imagine LLMs that can adapt instantly to new information rather than being confined to a predetermined scope of what they can comprehend or contextualize. That is the promise of Retrieval Augmented Generation (RAG). It is an important piece of the LLM value chain, effectively shifting LLMs from static tools to dynamic ones grounded in your company’s ever-expanding data. Traditional methods, like fine-tuning LLMs, involve expensive, time-consuming updates to keep models relevant. Yet these models often struggle to tap into data locked away in APIs, databases, or company documents. Fine-tuning is like adjusting the sails on a boat after a long voyage, but what if the weather has already changed? That’s where RAG comes in - it’s like giving the boat real-time navigation, constantly pulling in new data to steer accurately.

As opposed to tweaking the model itself, RAG fetches relevant information from trusted sources and folds it into the conversation to produce more precise, meaningful responses. This not only makes keeping your LLM application relevant faster and more cost-effective, it also ensures that outputs are grounded in up-to-date information, improving their relevance and trustworthiness for the end user.
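
To make the retrieve-then-generate flow concrete, here is a minimal sketch assuming the OpenAI Python client and an illustrative model name; the retrieval step that produces the snippets is left abstract and could be a vector search, a SQL query, or an API call.

```python
# Minimal RAG sketch: fold retrieved snippets into the prompt, then ask the
# LLM to answer grounded in that context. The model name is an assumption;
# retrieval itself (vector or SQL lookup) happens upstream of this function.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def answer_with_rag(question: str, retrieved_snippets: list[str]) -> str:
    context = "\n\n".join(retrieved_snippets)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context. "
                        "If the context is insufficient, say so."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```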

Even though the sources for RAG are deemed trusted, whether the user of the LLM is permitted to see the data retrieved into the application is an entirely different question, and one we have been asked several times. In our view, it’s not a matter of whether vector databases or any other storage modality can implement robust access controls - it’s a matter of when and how. Nevertheless, RAG has become closely linked with the rise of vector databases, which store data as mathematical representations (embeddings). Their rise has largely been due to their ability to store and retrieve unstructured data - like text, images, or embeddings - with remarkable speed and relevance. They also excel at semantic search, leveraging embeddings to find information based on meaning, not just keywords, allowing RAG to surface the most relevant data. However, this rise in popularity has also sparked misconceptions about security, with some assuming that securing vector databases is fundamentally different from securing other types of data storage.
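
As a rough illustration of semantic search, the sketch below embeds a question and a few document chunks and ranks the chunks by cosine similarity. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model purely as examples; a vector database performs the same ranking at scale, with indexing and filtering on top.

```python
# Semantic search sketch: rank document chunks by meaning, not keywords,
# using embedding cosine similarity. The model and chunks are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

chunks = [
    "Employees accrue 25 days of annual leave per calendar year.",
    "The VPN must be used when accessing internal systems remotely.",
    "Expense reports are reimbursed within 30 days of submission.",
]

def top_chunks(question: str, k: int = 2) -> list[str]:
    chunk_vecs = model.encode(chunks, normalize_embeddings=True)
    query_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ query_vec  # cosine similarity on normalized vectors
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]

print(top_chunks("How many vacation days do I get?"))
```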

In this article, we’ll go over the misconceptions we have heard and seen as a result of the LLM hype and a rapidly moving space.

A step back into reality

The question around RAG security has been clouded by misguided assumptions. These misconceptions are largely driven by industry players who may not have the full picture. Let’s clear up some key points:

  • Large Companies and Vector Databases: Large enterprises, for example your typical Fortune 500, are unlikely to adopt vector databases for production-scale use without robust Role-Based Access Control (RBAC) support. When such solutions are considered at these enterprises, they are likely to fail vendor evaluation processes because of this fundamental security gap - limited RBAC support. The idea of a data store containing potentially highly sensitive data without robust RBAC is a non-starter.
  • Risky Practices in Data Storage: We’ve seen several examples from security vendors, particularly startups focused exclusively on access controls for RAG, showcasing raw sensitive data (e.g. salaries) stored in vector databases with no RBAC, effectively making it accessible to any user of the LLM application. This is not an inherent security problem with the technology (i.e. LLMs or vector databases); rather, it showcases a questionable use of vector databases without basic security controls, suggesting poor engineering decisions and the lack of a thorough security review process.
  • Vector Databases as a Silver Bullet: Many in the industry believe vector databases are a one-size-fits-all solution. However, they overlook that the majority of the world’s data resides in SQL databases. Here are some key points to consider:
    • Vector databases are merely another form of data storage.
    • Using RAG with vector databases is not trivial and requires significant research, particularly in prompt engineering. For example, how to keep data ingestion up to date, how to retrieve only the relevant information in smaller chunks, and how to rank a large set of small chunks with similar text by relevance.
    • Why not leverage existing infrastructure? SQL databases already dominate the data landscape:
      • Every single Fortune 500 company uses SQL databases to store structured data, from CRMs and Employee Management Systems to Payroll.
      • The rise of text-to-SQL research offers new ways to retrieve data: LLMs generate SQL queries that are then used to extract data from traditional databases (see the text-to-SQL sketch after this list).
      • Retrieval is more effective when it targets the specific data a question needs rather than collecting data broadly.
      • It’s likely that vector databases will be used alongside SQL databases in many LLM use-cases. In fact, there are already numerous examples of this integration.
  • Non-Deterministic Access Control: We are often asked about, or compared to, other security vendors’ ability to provide “context-based” or “need-to-know” access controls on the data retrieved into the LLM from vector databases. In both cases, these vendors introduce non-deterministic controls for access to the vector database, based on probabilistic classification models that decide whether a user is permitted to ask certain questions given their role. This method presents several issues:
    • It adds unnecessary complexity and overhead, as classification models are placed in the application layer before retrieval, impacting performance in terms of both cost and latency.
    • Probabilistic models are never 100% precise, so they create unpredictability in access control, which can lead to inconsistent and unreliable permission management. Large organizations, which demand strict security protocols, cannot afford such an approach. Access control should be deterministic and clear-cut.
    • More critically, some vendors talk about RAG as if it were a standalone solution offered by a vendor lacking RBAC, rather than what it truly is: a technique for retrieving data from a data store to enrich the context of an LLM prompt. There are multiple methods to perform retrieval from various data stores, and the question of access control should hinge on whether the data store itself has adequate security measures in place. By shifting the responsibility of access control entirely to the application layer, some security vendors overlook a crucial principle: the data store must enforce its own access controls to maintain consistent and secure access policies, independent of the application logic.
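
Returning to the text-to-SQL point above, the sketch below shows how an LLM can generate a read-only SQL query from a natural-language question, which is then executed against an existing database where RBAC already applies. The schema, table names, database file, and model are illustrative assumptions, not a prescribed setup.

```python
# Text-to-SQL sketch: the LLM translates a question into SQL against a known
# schema; the query is then executed by the existing (RBAC-governed) database.
# Schema, table names, database file, and model are illustrative assumptions.
import sqlite3
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

SCHEMA = """
CREATE TABLE employees (id INTEGER, name TEXT, department TEXT, start_date TEXT);
"""

def question_to_sql(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Translate the question into a single read-only SQLite "
                        f"SELECT statement for this schema:\n{SCHEMA}\n"
                        "Return only the SQL, no explanation."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip()

def retrieve(question: str, db_path: str = "company.db") -> list[tuple]:
    sql = question_to_sql(question)
    with sqlite3.connect(db_path) as conn:
        # Database-level permissions still apply to whatever query is issued.
        return conn.execute(sql).fetchall()
```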

The future of RAG Security

The path forward for RAG security is clear: implement best practices and reinforce foundational security measures.

  1. RBAC as a Non-Negotiable: It’s essential to enforce RBAC in your vector database solutions. Without RBAC, sensitive data should never be stored in vector databases. Non-sensitive data that is accessible to all employees, however, can safely be stored in vector databases with limited RBAC controls to make an LLM application more relevant to employees’ day-to-day activities. For example, a company can store SQL table schemas and query templates, among other things, in a vector database to improve text-to-SQL generation.

  2. The Value of SQL in RAG: We believe a large untapped potential for RAG lies in the vast data stored in SQL databases, which large enterprises already depend on. SQL security is more deterministic, offering better control and protection:

    1. Text-to-SQL models can be used to interact with the data.
    2. SQL queries can be scanned and assessed for risk as an additional layer of defense beyond the RBAC controls already available (see the sketch after this list).

  3. The Future of Vector Database Security: Security in vector databases will need to be equally deterministic. The first step is for engineers to choose databases that support extensive RBAC.
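
To make point 2 above concrete, the sketch below scans a generated SQL query before it reaches the database, rejecting anything that is not a single SELECT statement or that touches tables outside an allow-list. It uses the sqlglot parser; the allow-list and rules are simplified assumptions rather than a complete policy engine.

```python
# SQL guardrail sketch: assess a generated query before it reaches the
# database, as an extra layer on top of the database's own RBAC.
# The allow-list and rules here are simplified, illustrative assumptions.
import sqlglot
from sqlglot import exp

ALLOWED_TABLES = {"employees", "departments"}  # hypothetical allow-list

def assess_query(sql: str) -> None:
    statements = sqlglot.parse(sql, read="sqlite")
    if len(statements) != 1:
        raise ValueError("Exactly one statement is allowed")
    stmt = statements[0]
    if not isinstance(stmt, exp.Select):
        raise ValueError("Only SELECT statements are allowed")
    tables = {t.name for t in stmt.find_all(exp.Table)}
    if not tables <= ALLOWED_TABLES:
        raise ValueError(f"Disallowed tables: {tables - ALLOWED_TABLES}")

# This passes the guard...
assess_query("SELECT name FROM employees WHERE department = 'Sales'")
# ...while "DELETE FROM employees" or a query on "salaries" would be rejected.
```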

Some solutions, like Qdrant, allow users to add “claims” - metadata or tags linked to user roles or groups. By adding these, you can map LLM users and their permissions against the claims attached to the data being retrieved, ensuring that RBAC is enforced both securely and seamlessly, as in the sketch below. It is therefore, again, critical for engineering teams to adhere to their organization’s security standards when making critical engineering decisions such as which data store to use for RAG. Similarly, security teams should be diligent in evaluating vector database vendors throughout their procurement processes. While some vector databases are perceived as market-leading, it is important to filter through the hype: many of them have weak RBAC capabilities, making their enterprise-readiness questionable, as identified by third-party security research firms.
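
As a sketch of what claims-based retrieval can look like in practice, the example below uses the Qdrant Python client with payload filtering: each stored chunk carries an allowed_roles field, and every search is filtered by the requesting user’s roles, so access decisions stay deterministic and are enforced by the data store itself. The field name, collection, vectors, and role model are illustrative assumptions rather than a prescribed design.

```python
# Claims-style retrieval sketch with Qdrant: each stored chunk carries an
# "allowed_roles" payload, and every search is filtered by the caller's roles,
# so retrieval is gated deterministically by the data store itself.
# Field names, the collection, vectors, and role model are illustrative.
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchAny, PointStruct, VectorParams,
)

client = QdrantClient(":memory:")  # in-memory instance for the sketch
client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=3, distance=Distance.COSINE),
)
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=1, vector=[0.1, 0.9, 0.0],
                    payload={"text": "Q3 salary bands",
                             "allowed_roles": ["hr"]}),
        PointStruct(id=2, vector=[0.2, 0.1, 0.9],
                    payload={"text": "Office travel policy",
                             "allowed_roles": ["hr", "employee"]}),
    ],
)

def search_as(user_roles: list[str], query_vector: list[float]):
    # Only points whose allowed_roles overlap with the user's roles are returned.
    return client.search(
        collection_name="docs",
        query_vector=query_vector,
        query_filter=Filter(must=[
            FieldCondition(key="allowed_roles", match=MatchAny(any=user_roles)),
        ]),
        limit=3,
    )

print(search_as(["employee"], [0.1, 0.8, 0.1]))  # salary bands are filtered out
```

The important property in this pattern is that the filter is applied inside the database at query time, so the application cannot accidentally retrieve data the user is not entitled to see.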

The future of Vector Database Security will therefore hinge on deterministic controls powered by tighter integrations between LLM Security vendors, Identity Access Management (IAM) or Privileged Access Management (PAM) systems, and vector database providers. As extended RBAC capabilities become available over time, vector databases will be able to support inherent gatekeeping, swiftly detecting and blocking users who attempt to retrieve data beyond their assigned permissions.

If you’re looking to protect your RAG-enabled application from insecure retrieval, reach out to us. Layer, our LLM runtime security solution, can help ensure your data remains safe and secure.