From a usage point of view, there are a few techniques for interacting with generative AI models, and with LLMs in particular.
Which one you choose depends on whether you need to work with external data or only need the LLM itself for your tasks.

RAG (retrieval-augmented generation) is an important and widely adopted technique because it lets you work with a wide variety of external data without fine-tuning your generative AI model. It also helps ground LLM responses in that data, which reduces hallucinations.
If you need to look at the basics of RAG, look at the below article:
While it is easy to get a high-level overview of RAG, designing and building production RAG systems requires understanding and identifying their common components.
To do this, let's walk through RAG from the user's point of view.
USER RAG FLOW:
Scenario: The user has many documents and texts, and wants to use an LLM to query them and generate responses.
STAGE 1: DOCUMENT UPLOAD

STAGE 2: LLM INTERACTION

BEHIND THE SCENES:
Now let’s dive deeper and look at what happens internally.
STAGE 1: DOCUMENT UPLOAD

Document broken into chunks -> Embedding model used to get a vector representation of each chunk -> Chunks stored in the vector DB
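The upload stage above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `chunk_text`, `toy_embed`, `InMemoryVectorStore`, and `ingest` are hypothetical names, the word-based chunk sizes are arbitrary, and the hashed bag-of-words vector is only a stand-in for a real embedding model.

```python
import hashlib
import math
import re


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into overlapping chunks of `chunk_size` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text, dim=64):
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryVectorStore:
    """Hypothetical minimal vector DB: a list of (chunk, vector) pairs."""

    def __init__(self):
        self.records = []

    def add(self, chunk, vector):
        self.records.append((chunk, vector))


def ingest(document, store):
    """Chunk the document, embed each chunk, and store it. Returns the record count."""
    for chunk in chunk_text(document):
        store.add(chunk, toy_embed(chunk))
    return len(store.records)
```

In a real system, the chunking strategy, the embedding model, and the vector DB are each a separate component choice; the overlap between chunks is there so that a sentence split across a chunk boundary still appears whole in at least one chunk.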
STAGE 2: INTERACTION WITH LLM

- User enters query -> Create embedding for the user query -> Vector similarity search for the query -> Apply a reranking algorithm to select the most relevant chunks -> Add the results as context to the existing prompt -> Send the request to the LLM -> Get response
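The query flow above can be sketched the same way. All names here (`toy_embed`, `retrieve`, `rerank`, `build_prompt`, `answer`) are hypothetical, the embedding is a toy hashed bag of words rather than a real model, the reranker is a simple word-overlap heuristic standing in for a real reranking model, and `llm` is assumed to be any callable that takes a prompt string and returns a response string.

```python
import hashlib
import math
import re


def toy_embed(text, dim=64):
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity; the vectors are already normalized, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, records, top_k=3):
    """Vector similarity search over (chunk, vector) records."""
    q = toy_embed(query)
    scored = sorted(records, key=lambda r: cosine(q, r[1]), reverse=True)
    return [chunk for chunk, _ in scored[:top_k]]


def rerank(query, chunks, keep=2):
    """Stand-in reranker: order candidates by the number of shared query words."""
    q_words = set(re.findall(r"[a-z0-9]+", query.lower()))

    def overlap(chunk):
        return len(q_words & set(re.findall(r"[a-z0-9]+", chunk.lower())))

    return sorted(chunks, key=overlap, reverse=True)[:keep]


def build_prompt(query, context_chunks):
    """Add the retrieved chunks as context to the prompt."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(query, records, llm):
    """Full flow: retrieve -> rerank -> build prompt -> call the LLM."""
    candidates = retrieve(query, records)
    context = rerank(query, candidates)
    return llm(build_prompt(query, context))
```

Passing a function such as `lambda prompt: prompt` as `llm` returns the assembled prompt unchanged, which is a convenient way to inspect what would be sent to the model before wiring in a real client.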
Optional steps:
- In some cases, we need to maintain conversational context (use a conversational cache). In others, we only need caching for optimization (use a semantic cache).
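The difference between the two caches can be illustrated with a small sketch. The class names, the 0.9 similarity threshold, and the three-turn window are assumptions chosen for the example, and the sparse bag-of-words vector is a toy stand-in for a real embedding model. A semantic cache returns a stored response when a new query is close enough to an earlier one, while a conversational cache keeps the most recent turns so they can be prepended to the next prompt.

```python
import math
import re
from collections import Counter, deque


def toy_embed(text):
    """Toy stand-in for an embedding model: sparse bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Returns a cached response when a new query is similar to an earlier one."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (query_vector, response)

    def get(self, query):
        q = toy_embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: the caller should run the full RAG flow

    def put(self, query, response):
        self.entries.append((toy_embed(query), response))


class ConversationCache:
    """Keeps the last `max_turns` exchanges to carry conversational context."""

    def __init__(self, max_turns=3):
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user, assistant):
        self.turns.append((user, assistant))

    def as_prompt_prefix(self):
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.turns)
```

A semantic cache hit skips retrieval and the LLM call entirely, which is why it is an optimization; the conversational cache does the opposite and adds text to every prompt, so the window size is a trade-off between context and prompt length.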
What else:
- We need to log and evaluate the LLM responses. We also need to build guardrails against profanity and hallucination.
- We need an orchestrator if the LLM has to run for batch use cases.
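As a rough illustration of the logging and guardrail points above, the sketch below logs each response and applies two very crude checks. Everything in it is an assumption made for the example: the `BLOCKLIST` terms are placeholders, the grounding score (the share of response words that also appear in the retrieved context) is only a naive proxy for hallucination detection, and the 0.5 threshold is arbitrary. Real systems typically rely on dedicated evaluation and guardrail tooling.

```python
import logging
import re

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rag")

BLOCKLIST = {"darn", "heck"}  # hypothetical placeholder terms


def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def contains_blocked_terms(text):
    """Naive profanity guardrail: exact match against a blocklist."""
    return bool(tokens(text) & BLOCKLIST)


def grounding_score(response, context):
    """Share of response words that also occur in the retrieved context."""
    resp = tokens(response)
    if not resp:
        return 0.0
    return len(resp & tokens(context)) / len(resp)


def check_response(query, response, context, min_grounding=0.5):
    """Log the response and return True only if it passes both checks."""
    score = grounding_score(response, context)
    passed = not contains_blocked_terms(response) and score >= min_grounding
    logger.info("query=%r grounding=%.2f passed=%s", query, score, passed)
    return passed
```

The logged scores double as a simple evaluation record: collecting them over time shows how often responses fall below the grounding threshold.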
COMPONENTS OF RAG SYSTEM:
Now that we understand the user flow and why each component is used, let’s sum it all up in a component diagram and look at examples of existing tools and software for implementing each of the identified components.



Note:
- Not every component is necessary for every RAG system. This varies from use case to use case.
- There is no single right tool or model for all use cases. Each tool and model needs to be researched before enterprise use.
- The aim of this article is to establish the common components that I have seen in RAG systems. These might change, or new components might be added, as we find better ways to do things.