Retrieval Augmented Generation (RAG): How to Get Large Language Models to Learn Your Data & Give You Answers
With the rise of Large Language Models, more and more people want to get answers from their own data sources. RAG enables this.
With the growth of AI and Large Language Models, there is rising demand from people who want to get answers from their own data sources. Integrating methods to search and retrieve data from different sources, and passing the results to these AI models, has become critical. Retrieval Augmented Generation, or RAG, is a potential solution for this.
AI and Generating Human-Like Responses
Back in 2013, the AI landscape was vastly different. The buzz was about simple neural networks, the foundational building blocks that paved the way for future advancements. People were happy if they could quickly train a model and get good scores on MNIST.
Later, we got deep neural networks capable of performing tasks in areas like image and speech recognition, outperforming traditional algorithms.
But the real change came in November 2022, when OpenAI launched ChatGPT. It took the world by storm, amassing over 100 million users in just two months, one of the most significant moments in internet and AI history.
Not just because it could perform multiple tasks, but because its capability surprised so many people: it could understand questions and answer them in a human-like manner, which simply wasn't possible before.
Since then, large language models have taken the world by storm. We're seeing the rise of AI startups and enterprises developing their own large language models, companies adopting the technology in their own workplaces, and plenty of debate on how to fine-tune large language models to improve performance, generate summaries, and more.
The question becomes: can anyone use their own data, pass it to an LLM, and get it to generate insights on the fly?
Yes, anyone can. You can take PDFs, multiple code files, answers from databases, and more, and get answers from your Generative AI tool. But there are two distinct approaches to this problem: fine-tuning and Retrieval Augmented Generation.
Retrieval Augmented Generation aka RAG ✨
Retrieval Augmented Generation (RAG) is a recent advancement in Artificial Intelligence. It's a form of Open Domain Question Answering with a Retriever and a Generative AI Model. It combines a search system with AI models like ChatGPT, Llama2, etc. With RAG, the system searches a vast knowledge base for up-to-date data and articles. This data is then used by the AI to give precise answers. This method helps reduce errors in AI responses and offers more customized solutions.
So, with RAG, the retriever can access the latest data, sources, and other important articles from a very large knowledge base, and then provide them as input to the Generative AI model. Hence the name Retrieval Augmented Generation. This approach allows the Large Language Model to tap into a vast knowledge base and provide relevant, to-the-point information.
This significantly reduces the hallucination problem faced by large language models, and it can provide answers tailored to you.
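To make the flow concrete, here is a minimal, self-contained sketch of the retrieve-then-generate loop. The toy document store, the keyword-overlap retriever, and the prompt template are all illustrative assumptions, not Swirl's implementation; a real system would use a search engine or vector index, and the assembled prompt would be sent to a model like ChatGPT or Llama 2.

```python
# Minimal RAG sketch: retrieve relevant passages, then ground the
# generator's answer in them via the prompt. Everything here (the corpus,
# the scoring, the template) is a toy assumption for illustration.

documents = [
    "Bun is a fast, all-in-one JavaScript runtime.",
    "Swirl is an open-source metasearch engine written in Python.",
    "RAG pairs a retriever with a generative AI model.",
]

def retrieve(query: str, top_n: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_n]

def build_prompt(query: str) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n\nQuestion: {query}")

# The assembled prompt is what gets sent to ChatGPT, Llama 2, etc.
print(build_prompt("What is Swirl?"))
```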
Why Use Retrieval Augmented Generation? Aren't Current AI Models Enough?
Current-generation AI models have a training cutoff date, after which they stop learning. Ask them about events that happened after that date, and retrieving recent information becomes a challenge.
Take a look at this example: asking ChatGPT about Bun.
ChatGPT Recommends:
I would recommend checking the official documentation or repositories, tech news websites, or relevant community forums.
Could we not send the text of these documents, repositories, websites, and community forums directly to ChatGPT and get relevant answers? That is exactly where RAG helps.
And it's not just about Bun. What if, in the same manner, you wanted to query your company's data and get insights and answers relevant to your own data, without making it public?
This is where Retrieval Augmented Generation shines, providing answers with sources. And on the question of connecting multiple data sources, Swirl can help you solve the problem quickly. Swirl can:
Connect to various data sources.
Perform query processing via ChatGPT.
Re-rank results and return the top-N answers using spaCy's large language model (the Cosine Relevancy Processor); see the sketch below.
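Here is a simplified sketch of what cosine-similarity re-ranking looks like with spaCy's large English model. This illustrates the idea behind a cosine relevancy processor, not Swirl's actual code; it assumes you have installed spaCy and downloaded the en_core_web_lg model.

```python
# Toy re-ranker: order search results by vector similarity to the query.
# Illustrative only; Swirl's real Cosine Relevancy Processor differs.
# Setup: pip install spacy && python -m spacy download en_core_web_lg
import spacy

nlp = spacy.load("en_core_web_lg")  # large English model with word vectors

def rerank(query: str, results: list[str], top_n: int = 3) -> list[str]:
    """Score each result against the query and keep the top-N."""
    query_doc = nlp(query)
    scored = [(query_doc.similarity(nlp(text)), text) for text in results]
    scored.sort(reverse=True)  # highest cosine similarity first
    return [text for _, text in scored[:top_n]]

results = [
    "Bun 1.0 release notes and installation guide",
    "A recipe for soft sourdough bread buns",
    "Benchmarking the Bun JavaScript runtime against Node.js",
]
print(rerank("Bun JavaScript runtime documentation", results, top_n=2))
```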
RAG vs. Fine-Tuning
What is Fine-Tuning?
Fine-tuning a Large Language Model means further training it on a dataset to make it really good at a subtask. People have fine-tuned Llama 2 for various tasks like writing SQL, Python code, etc.
While this works well for tasks with massive, static data, like Python syntax or SQL, the problem comes when you want to train the model on something new, or when no large dataset is available.
If the dataset keeps changing, you must retrain the model to keep up with the changes, and retraining is expensive.
Consider fine-tuning on the documentation of Bun, Astro, Swirl, or your company's documents. Also note that fine-tuning makes the model good at a specific task, but you may not be able to trace an answer back to its source or get relevant citations.
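For a sense of what that retraining loop involves, here is a minimal fine-tuning sketch using the Hugging Face transformers Trainer. The model (gpt2 as a small stand-in), the one-example dataset, and the hyperparameters are placeholder assumptions for illustration, not a recipe for fine-tuning Llama 2.

```python
# Minimal causal-LM fine-tuning sketch with Hugging Face transformers.
# Illustrative assumptions: tiny stand-in model, toy one-example dataset.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# In practice this would be thousands of task-specific examples (e.g. SQL).
texts = ["Question: list all users. SQL: SELECT * FROM users;"]
encodings = tokenizer(texts, truncation=True, padding=True)
dataset = [{"input_ids": ids, "attention_mask": mask}
           for ids, mask in zip(encodings["input_ids"],
                                encodings["attention_mask"])]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the expensive step
```

Every time the underlying documents change, this training step, and the GPU time it consumes, has to be repeated. That recurring cost is exactly what RAG avoids.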
Can You Do Fine-Tuning?
Answer these questions:
Do you have the engineers and hardware required for training a Large Language Model?
Do you have the data necessary to get good answers from the Large Language Model?
Do you have time?
If the answer to any of these three questions is "no," then you need to reconsider fine-tuning and opt for a better, more accessible alternative.
RAG Fits the Scenarios Where Fine-Tuning Doesn't
Small documentation sets.
Articles, research papers, blogs.
Newly created code bases, etc.
Generating answers from them is easier than you think. There are many options for building a RAG pipeline, but Swirl makes both parts, Retrieval and Generation, easier.
Swirl can take your search query, run it across your sources, and provide the top-N best answers. Check out our GitHub.
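As a rough idea of what that looks like in practice, here is a hypothetical sketch of querying a locally running Swirl instance over HTTP. The endpoint path and response shape are assumptions for illustration only; consult the Swirl documentation on GitHub for the actual interface.

```python
# Hypothetical sketch: send a query to a locally running Swirl instance.
# The URL path and JSON shape below are assumptions, not Swirl's documented
# API; check the project's GitHub docs for the real interface.
import requests

resp = requests.get(
    "http://localhost:8000/swirl/search/",  # assumed local Swirl endpoint
    params={"q": "What is RAG?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # inspect the retrieved, re-ranked results
```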
Contribute to Swirl
Swirl is an open-source library in Python 🐍, and we're looking for people to help build this software. We're looking for fantastic people who can:
Create excellent articles, enhance our README, UI, etc.
Contribute by adding a connector or search provider.
Join our community on Slack.
It would mean a lot if you could give us a ⭐ on GitHub. It keeps the team motivated. 🔥