Repochat is an interactive chatbot project designed to engage in conversations about GitHub repositories using a Large Language Model (LLM). It allows users to have meaningful discussions, ask questions, and retrieve relevant information from a GitHub repository. This README provides step-by-step instructions for setting up and using Repochat on your local machine.
Repochat.mp4
Repochat offers 2 branches with distinct functionalities:
The main branch of Repochat is designed to run entirely on your local machine. This version of Repochat doesn't rely on external API calls and offers greater control over your data. If you're looking for a self-contained solution, the main
branch is the way to go.
The cloud branch of Repochat primarily relies on API calls to external services for model inference and storage. It's well-suited for those who prefer a cloud-based solution and don't want to set up a local environment.
To get started with Repochat, you'll need to follow these installation steps:
-
Create a virtual environment and activate on your local machine to isolate the project's dependencies.
python -m venv repochat-env source repochat-env/bin/activate
-
Clone the Repochat repository and navigate to the project directory.
git clone https://github.com/pnkvalavala/repochat.git cd repochat
-
Install the required Python packages using
pip
.pip install -r requirements.txt
-
Install the "llama-cpp-python" library.
pip install llama-cpp-python
llama.cpp
supports multiple BLAS backends for faster processing.To install with OpenBLAS, set the
LLAMA_BLAS and LLAMA_BLAS_VENDOR
environment variables before installing:CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
To install with cuBLAS, set the
LLAMA_CUBLAS=1
environment variable before installing:CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
To install with CLBlast, set the
LLAMA_CLBLAST=1
environment variable before installing:CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
To install with Metal (MPS), set the
LLAMA_METAL=on
environment variable before installing:CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
To install with hipBLAS / ROCm support for AMD cards, set the
LLAMA_HIPBLAS=on
environment variable before installing:CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
To get to know more about Hardware Acceleration, refer to official README from llama-cpp-python
-
Create a folder named
models
in the project directory. -
Download a Language Model from the Hugging Face Model Hub based on your computer's capabilities. It is recommended using the following model as a starting point: TheBloke/CodeLlama-7B-GGUF. If you want to quantize a model available on Hugging Face, follow the instructions from llama.cpp
-
Copy the downloaded model file to the "models" folder.
-
Open the
models.py
file located in the "repochat" folder and set the model file location in thecode_llama()
function as follows:def code_llama(): callbackmanager = CallbackManager([StreamingStdOutCallbackHandler()]) llm = LlamaCpp( model_path="./models/codellama-7b.Q4_K_M.gguf", n_ctx=2048, max_tokens=200, n_gpu_layers=1, f16_kv=True, callback_manager=callbackmanager, verbose=True, use_mlock=True ) return llm
-
Open your terminal and run the following command to start the Repochat application:
streamlit run app.py
-
You can now input the GitHub repository link.
-
Repochat will fetch all the files from the repository and store them in a folder named "cloned_repo." It will then split the files into smaller chunks and calculate their embeddings using the Sentence Transformers model, specifically sentence-transformers/all-mpnet-base-v2.
-
The embeddings are stored locally in a vector database called ChromaDB.
Repochat allows you to engage in conversations with the chatbot. You can ask questions or provide input, and the chatbot will retrieve relevant documents from the vector database. It then sends your input, along with the retrieved documents, to the Language Model for generating responses. By default, I've set the model to "codellama-7b-instruct," but you can change it based on your computer's speed, and you can even try the 13b quantized model for responses.
The chatbot retains memory during the conversation to provide contextually relevant responses.
If you encounter any issues, have suggestions, or want to report a bug, please visit the Issues section of the Repochat repository and create a new issue. Provide detailed information about the problem you're facing, and I'll do my best to assist you.
This project is licensed under the Apache License 2.0. For details, see the LICENSE file. Please note that this is a change from the previous license, and it's important to review the terms and conditions of the new license.