- In this end-to-end project I built a RAG app using the ObjectBox vector database and LangChain. RAG techniques let us actively augment a language model's knowledge base, so your AI can access and reason over your own data and the latest information. With ObjectBox you can do that without the data ever leaving the device.
- You can check the project live here
- This project showcases an advanced RAG system that uses the ObjectBox vector database and Groq's Llama 3 model as the LLM to retrieve information from different PDF documents.
Steps I followed:
- I used the PyPDFDirectoryLoader from the langchain_community document loaders to load the PDF documents from the us-census-data directory.
- Split the loaded text into chunks of size 1000 using the RecursiveCharacterTextSplitter imported from langchain.text_splitter.
- Stored the vector embeddings, created with HuggingFaceBgeEmbeddings, in the ObjectBox vector store.
- Set up the LLM, ChatGroq, with the model name Llama3-8b-8192.
- Set up the ChatPromptTemplate.
- Set up the vector_embedding function to embed the documents and store them in the ObjectBox vector store.
- Finally, created the document_chain and retrieval_chain, which chain the LLM to the prompt and the retriever to the document_chain, respectively.
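
The steps above follow a common split, embed, store, retrieve, prompt flow. The sketch below illustrates that flow using only the Python standard library, so it runs without any API keys. It is not the project's code and does not use the LangChain or ObjectBox APIs: fixed-size character windows stand in for RecursiveCharacterTextSplitter (which also tries to break on separators), word-count vectors stand in for HuggingFaceBgeEmbeddings, and an in-memory list stands in for the ObjectBox vector store. The names split_text, embed, ToyVectorStore, and build_prompt are hypothetical, and the chunk_overlap value of 200 is an assumption (the steps only state a chunk size of 1000).

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size character windows that overlap.

    chunk_overlap=200 is an assumed value for illustration only.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store (hypothetical, not ObjectBox)."""

    def __init__(self):
        self.items = []  # list of (vector, chunk) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.items.append((embed(chunk), chunk))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only this context.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )
```

In a real pipeline, the string returned by build_prompt would be sent to the LLM. That call is left out here because it needs a model and credentials.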
- langchain==0.1.20
- langchain-community==0.0.38
- langchain-core==0.1.52
- langchain-groq==0.1.3
- langchain-objectbox
- python-dotenv==1.0.1
- pypdf==4.2.0
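
Because most of these libraries are pinned to exact versions, it can help to confirm that the active environment matches them. The following is a minimal sketch using the standard library's importlib.metadata. The helper names installed_version and find_mismatches are hypothetical, and langchain-objectbox is left out of the check because the list above does not pin it.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the list above (langchain-objectbox is unpinned).
PINS = {
    "langchain": "0.1.20",
    "langchain-community": "0.0.38",
    "langchain-core": "0.1.52",
    "langchain-groq": "0.1.3",
    "python-dotenv": "1.0.1",
    "pypdf": "4.2.0",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) tuple."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script inside the virtual environment prints one line per package that is missing or has a different version, and prints nothing when everything matches.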
- Prerequisites
- Git
- Command line familiarity
- Clone the Repository:
git clone https://github.com/NebeyouMusie/End-to-End-RAG-Project-using-ObjectBox-and-LangChain.git
- Create and Activate Virtual Environment (Recommended)
python -m venv venv
source venv/bin/activate
- Navigate to the project's directory using your terminal
cd ./End-to-End-RAG-Project-using-ObjectBox-and-LangChain
- Install Libraries:
pip install -r requirements.txt
- Navigate to the app directory using your terminal
cd ./app
- Run
streamlit run app.py
- Open the link displayed in the terminal in your preferred browser
- As I have already embedded the documents, you don't need to click the Embedd Documents button. If it's not working, click the Embedd Documents button and wait until the documents are processed.
- Enter your question about the PDFs found in the us-census-data directory
- Collaborations are welcomed ❤️
- I would like to thank Krish Naik
- LinkedIn: Nebeyou Musie
- Gmail: nebeyoumusie@gmail.com
- Telegram: Nebeyou Musie