adekoyadapo/llm-rag-local

RAG Chat Assistant

This Python project implements a chat assistant using the RAG (Retrieval-Augmented Generation) architecture. It leverages the LangChain Community libraries for vector stores, chat models, embeddings, document loaders, and more.

Setup

Dependencies

  • conda or pyenv - manages the Python virtual environment (not required, but recommended for isolation)
  • python

Conda Environment

Create and activate a Conda environment:

conda create -p .venv python=3.11
conda activate ./.venv
conda install pip

Install Dependencies

Install the required dependencies using pip:

pip install -r requirements.txt

Usage

  1. Download a llama model from Hugging Face to a local directory, then set model_path in load_model to point at the model file (a loader sketch follows these steps).

    def load_model(
        model_path="model/llama-2-7b-chat.ggmlv3.q8_0.bin",
        model_type="llama",
        max_new_tokens=512,
        content_length=400,
        temperature=0.9,
    ):
  2. Specify the persistence directory for the vector database:

    persist_directory = "./db/rag"  # Provide the path for the persistence directory
  3. Run the code.

    streamlit run main.py
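
A minimal sketch of the loader referenced in step 1, assuming the GGML model is run through LangChain Community's CTransformers wrapper; the actual loader in main.py may be wired differently, and the config keys shown are ctransformers settings.

    # Sketch only: assumes LangChain Community's CTransformers wrapper.
    from langchain_community.llms import CTransformers

    def load_model(
        model_path="model/llama-2-7b-chat.ggmlv3.q8_0.bin",
        model_type="llama",
        max_new_tokens=512,
        content_length=400,
        temperature=0.9,
    ):
        # CTransformers runs GGML llama models on CPU, no GPU required.
        return CTransformers(
            model=model_path,
            model_type=model_type,
            config={
                "max_new_tokens": max_new_tokens,
                "context_length": content_length,  # ctransformers' context-window key
                "temperature": temperature,
            },
        )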

The assistant will start, and if the vector database is not present, it will be created using the provided documents in the ./docs/ directory.
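
The indexing pass is not shown above; here is a hedged sketch of how the store might be built, assuming Chroma as the local vector store and GPT4AllEmbeddings (per the project description). The loader globs and splitter sizes are illustrative, not taken from the repository.

    # Sketch of the indexing step; loader globs and chunk sizes are illustrative.
    from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader
    from langchain_community.embeddings import GPT4AllEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    persist_directory = "./db/rag"

    # Load both PDF and plain-text documents from ./docs/.
    pdf_docs = DirectoryLoader("./docs/", glob="**/*.pdf", loader_cls=PyPDFLoader).load()
    txt_docs = DirectoryLoader("./docs/", glob="**/*.txt", loader_cls=TextLoader).load()

    # Split into overlapping chunks so retrieved context fits the model's window.
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_documents(pdf_docs + txt_docs)

    # Embed and persist; on later runs the existing store is simply reopened.
    vectordb = Chroma.from_documents(
        chunks, GPT4AllEmbeddings(), persist_directory=persist_directory
    )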

Chat Interface

The assistant provides a chat interface using Streamlit. Users can interact with the assistant by typing messages in the input box. The assistant processes the queries using the RAG architecture and responds accordingly.
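
Each query is answered by a retrieval chain of roughly this shape. The sketch below assumes LangChain's RetrievalQA with the "stuff" chain type and reuses load_model() and vectordb from the sketches above; the chain in main.py may be wired differently.

    from langchain.chains import RetrievalQA

    # Sketch: connect the local LLM to the vector store's retriever.
    qa_chain = RetrievalQA.from_chain_type(
        llm=load_model(),
        chain_type="stuff",  # stuff retrieved chunks directly into the prompt
        retriever=vectordb.as_retriever(search_kwargs={"k": 3}),
    )

    answer = qa_chain.invoke({"query": "What do the documents say about X?"})
    print(answer["result"])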

Dependencies

  • torch
  • streamlit
  • ollama
  • Hugging Face
  • langchain_community

Notes

  • The code supports both PDF and text documents for creating the vector database.
  • The assistant's responses are displayed with a simulated typing effect (see the sketch below).
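
The typing effect can be approximated with Streamlit's st.empty placeholder, as in this illustrative sketch (not necessarily the repository's implementation):

    import time
    import streamlit as st

    def type_out(text: str, delay: float = 0.02) -> None:
        # Re-render the message word by word to simulate typing.
        placeholder = st.empty()
        shown = ""
        for word in text.split():
            shown += word + " "
            placeholder.markdown(shown)
            time.sleep(delay)

    with st.chat_message("assistant"):
        type_out("Hello! Ask me anything about your documents.")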

Feel free to customize the code for your specific use case and Ollama model. For more information, refer to the official LangChain Community documentation.

About

Simple Chat interface to implement RAG using a local Vectordb, documents and GPT4AllEmbeddings
