RAG Components – 10,000 ft Level

When you look at generative AI, and LLMs specifically, from a usage point of view, there are a few techniques for interacting with LLMs.

Which technique you choose depends on whether you need to work with external data or can rely on the LLM alone for your tasks.

RAG (retrieval-augmented generation) is an important and widely adopted technique because it lets you work with a wide variety of external data without having to fine-tune your generative AI model. It also helps ground LLM responses in that data, which reduces hallucinations.

If you need a refresher on the basics of RAG, see the article below:

RAG_BASICS

While it is easy to get a high-level overview of RAG, when we design and build production RAG systems it is important to identify and understand the common components.

To do this, let's walk through RAG from the user's point of view.

USER RAG FLOW:

Scenario: The user has many documents and texts and wants to use an LLM to query them and generate responses.

STAGE 1: DOCUMENT UPLOAD

STAGE 2: LLM INTERACTION

BEHIND THE SCENES:

Now let's dive deeper and look at what happens internally.

STAGE 1: DOCUMENT UPLOAD

Document broken into chunks -> Embedding model used to get a vector representation of each chunk -> Chunks stored in the vector DB
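The upload stage above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `chunk_text`, `toy_embed`, `InMemoryVectorStore` and `ingest` are hypothetical names, the hashed bag-of-words function only stands in for a real embedding model, and the list-backed store stands in for a real vector DB.

```python
import math
import zlib
from dataclasses import dataclass, field


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap  # assumes chunk_size > overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Assumption: stand-in for an embedding model (hashed bag-of-words, L2-normalized)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class InMemoryVectorStore:
    """Assumption: stand-in for a vector DB; keeps (vector, chunk) pairs in a list."""
    rows: list[tuple[list[float], str]] = field(default_factory=list)

    def add(self, vector: list[float], chunk: str) -> None:
        self.rows.append((vector, chunk))


def ingest(document: str, store: InMemoryVectorStore) -> int:
    """Chunk the document, embed each chunk, store it; return the chunk count."""
    chunks = chunk_text(document)
    for chunk in chunks:
        store.add(toy_embed(chunk), chunk)
    return len(chunks)
```

In a real system the chunk size, the overlap and the embedding model are all tuning decisions that depend on the documents and the use case.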

STAGE 2: LLM INTERACTION

User enters query -> Create embedding for the user query -> Vector similarity search for the query -> Apply reranking algorithm to select the most relevant chunks -> Use results to add context to the existing prompt -> Send the request to the LLM -> Get response
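The interaction flow above can be sketched as follows. Everything here is an assumption made for illustration: `toy_embed` stands in for a real embedding model, `rerank` is a simple word-overlap heuristic rather than a real reranking model, and `fake_llm` is a placeholder for an actual LLM call.

```python
import math
import zlib


def toy_embed(text, dim=64):
    """Assumption: stand-in for an embedding model (hashed bag-of-words, L2-normalized)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def similarity_search(query_vec, index, top_k=3):
    """Return the top_k chunks by dot product (equals cosine for normalized vectors)."""
    def score(row):
        return sum(q * v for q, v in zip(query_vec, row[0]))
    ranked = sorted(index, key=score, reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]


def rerank(query, chunks, keep=2):
    """Toy reranker: order candidates by how many query words they share."""
    query_words = set(query.lower().split())
    def overlap(chunk):
        return len(query_words & set(chunk.lower().split()))
    return sorted(chunks, key=overlap, reverse=True)[:keep]


def build_prompt(query, context_chunks):
    """Add the retrieved chunks as context to a fixed prompt template."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def fake_llm(prompt):
    """Placeholder for a real LLM call; returns a canned string."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"


def answer(query, index, llm=fake_llm):
    """Run the query flow: embed, search, rerank, build prompt, call the LLM."""
    candidates = similarity_search(toy_embed(query), index)
    context = rerank(query, candidates)
    prompt = build_prompt(query, context)
    return prompt, llm(prompt)


# Usage: `index` is a list of (vector, chunk) pairs produced at upload time.
chunks = [
    "refunds are processed within five days",
    "the office is closed on public holidays",
    "shipping takes two weeks",
]
index = [(toy_embed(c), c) for c in chunks]
prompt, response = answer("how are refunds processed", index)
```

Retrieving more candidates than you finally keep (here three retrieved, two kept) gives the reranker room to correct mistakes made by the vector search.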

Optional steps:

In some cases we need to maintain conversational context (use a conversational cache), and in others we only need caching for optimization (use a semantic cache).
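To make the difference between the two caches concrete, here is a minimal sketch. The class names, the `threshold` value and the tiny vocabulary-count `vocab_embed` function are assumptions for illustration; a real semantic cache would use a proper embedding model and usually a vector store.

```python
import math
from collections import deque


class ConversationCache:
    """Keeps the last `max_turns` exchanges so follow-up questions have context."""

    def __init__(self, max_turns=5):
        self.turns = deque(maxlen=max_turns)  # oldest turns are dropped automatically

    def add(self, user_msg, assistant_msg):
        self.turns.append((user_msg, assistant_msg))

    def as_context(self):
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Returns a stored response when a new query is similar enough to an old one."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # any function mapping text to a vector
        self.threshold = threshold  # assumed cut-off; tune for your data
        self.entries = []           # list of (query_vector, response)

    def put(self, query, response):
        self.entries.append((self.embed(query), response))

    def get(self, query):
        query_vec = self.embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query_vec, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


# Assumption: a toy embedding that counts words from a fixed vocabulary.
VOCAB = ["refund", "policy", "shipping", "time", "what", "is", "the"]


def vocab_embed(text):
    words = text.lower().replace("?", "").split()
    return [float(words.count(w)) for w in VOCAB]


cache = SemanticCache(vocab_embed)
cache.put("What is the refund policy?", "Refunds are accepted within 30 days.")
hit = cache.get("what is the refund policy")  # same meaning, different surface form
miss = cache.get("shipping time")             # unrelated query
```

The conversational cache changes what goes into the prompt, while the semantic cache lets you skip the LLM call entirely when a close enough question has already been answered.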

What else:

We need to log and evaluate the LLM responses. We also need to build guardrails to account for profanity and hallucination.
We need an orchestrator if the LLM has to run for batch use cases.
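As a rough idea of what logging and guardrails can look like in code, here is a small sketch. The blocked-word list, the `min_grounding` threshold and the word-overlap grounding check are all assumptions for illustration; the overlap score is only a crude proxy for hallucination detection, and real systems typically use dedicated evaluation and moderation tools.

```python
import logging
import re

logger = logging.getLogger("rag")

BLOCKED_TERMS = {"damn", "hell"}  # hypothetical blocklist for illustration


def contains_profanity(text):
    """True if any word in the text is on the blocklist."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & BLOCKED_TERMS)


def grounding_score(response, context):
    """Fraction of response words that also appear in the retrieved context."""
    response_words = re.findall(r"[a-z0-9']+", response.lower())
    context_words = set(re.findall(r"[a-z0-9']+", context.lower()))
    if not response_words:
        return 0.0
    return sum(w in context_words for w in response_words) / len(response_words)


def guarded_response(query, context, response, min_grounding=0.5):
    """Log the exchange, then return the response or a safe fallback."""
    score = grounding_score(response, context)
    logger.info("query=%r grounding=%.2f", query, score)
    if contains_profanity(response):
        logger.warning("response blocked: profanity")
        return "Sorry, I can't share that response."
    if score < min_grounding:
        logger.warning("response blocked: low grounding (%.2f)", score)
        return "I could not find a well-supported answer in the documents."
    return response
```

The logged scores can also feed offline evaluation, so the same hook serves both the logging and the guardrail needs.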

COMPONENTS OF RAG SYSTEM:

Now that we have understood the user flow and why we use the different components, let's sum it all up in a component diagram and look at examples of existing tools and software that implement each of the identified components.

Note:

Not every component is necessary for every RAG system. This varies from use case to use case.
There is no single right tool or model for all use cases. Each tool and model needs to be researched before enterprise use.
The aim of this article is to establish the common components that I have seen in RAG systems, but these might change, and new components could be added, as we find better ways to do things.

Published by rohan ganesh
