A simple chat UI for asking questions about a private codebase, using LLMs that run locally.

Technology:
- Ollama with llama3:8b as the Large Language Model
- jina-embeddings-v2-base-code as the embedding model
- LangChain as the LLM application framework
- Chainlit for the chat UI
1. Make sure you have Python 3.9 or later installed.

2. Download and install Ollama.

3. Pull the model:

   ```shell
   ollama pull llama3:8b
   ```

4. Run the model:

   ```shell
   ollama run llama3:8b
   ```

5. Create a Python virtual environment and activate it:

   ```shell
   python3 -m venv .venv-rag-time && source .venv-rag-time/bin/activate
   ```

6. Install the Python dependencies:

   ```shell
   pip install -r requirements.txt
   ```

7. Clone an example repository to question the chat bot about:

   ```shell
   git clone https://github.com/discourse/discourse
   ```

8. Set up the vector database:

   ```shell
   python ingest-code.py
   ```

9. Start the chat bot:

   ```shell
   chainlit run main.py
   ```

To exit the Python virtual environment after you are done, run:

```shell
deactivate
```
Modify the `.env` file to run the chat bot on your own codebase and language.
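As a rough illustration, a `.env` might look like the sketch below. `CODEBASE_PATH` and `VECTOR_DB_PATH` are the variables referenced elsewhere in this README; check the `.env` shipped with the project for the actual keys and defaults:

```shell
# Codebase the chat bot should answer questions about (assumed key name)
CODEBASE_PATH="./discourse"
# Location of the Chroma vector database (assumed key name)
VECTOR_DB_PATH=".rag_time/chroma_db"
```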
"The file "....ex" is missing a module comment. Can you create one that helps new team members understand what it does, and how it works?"
"Given the following Mission, can you please explain what files I should change, and how I can implement the changes?"
The script `ingest-code.py` is intended to be easy to understand and to modify. For more control over the codebase processing, you can use the `process_codebase.py` script.
```
usage: process_codebase.py [-h] [-c] [-cd CHUNKS_DIR] [-db CHROMA_DB_DIR] [-oh] [-ed] base_directory

Process subdirectories for chunking and ingestion.

positional arguments:
  base_directory        Base directory to process

options:
  -h, --help            show this help message and exit
  -c, --clean           Clean existing chunks and chroma db before processing
  -cd CHUNKS_DIR, --chunks_dir CHUNKS_DIR
                        If given, chunks are stored into this directory.
  -db CHROMA_DB_DIR, --chroma_db_dir CHROMA_DB_DIR
                        Directory for Chroma DB (default: .rag_time/chroma_db)
  -oh, --omit-headers   Do not add filename header in chunks
  -ed, --empty_db       Only create an empty chroma db
```
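To give a feel for what the chunking step does, including the filename header that `-oh` omits, here is a simplified, hypothetical sketch. The function names, parameters, and header format below are illustrative, not the scripts' actual API:

```python
from pathlib import Path


def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into overlapping fixed-size chunks (simplified)."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks


def chunk_file(path, add_header=True, chunk_size=1000, overlap=100):
    """Chunk one source file; optionally prefix each chunk with its filename.

    The header helps the LLM attribute retrieved chunks to files; omitting it
    (compare the -oh switch) produces bare code chunks.
    """
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    chunks = chunk_text(text, chunk_size, overlap)
    if add_header:
        chunks = [f"# File: {path}\n{chunk}" for chunk in chunks]
    return chunks
```

The real scripts also embed each chunk and store it in the Chroma database; this sketch only covers the splitting.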
To achieve the same result as `ingest-code.py`, you can run:

```shell
python process_codebase.py -c -db .rag_time/chroma_db ./discourse
```
To use this database in Chainlit, set the `VECTOR_DB_PATH` environment variable to the Chroma DB directory you created:

```shell
export VECTOR_DB_PATH=".rag_time/chroma_db" ; chainlit run main.py
```
The switches `-c`, `-ed`, and `-oh` are useful for evaluating the impact of different processing options.
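For example, to compare retrieval quality with and without filename headers, you could build two separate databases and point the chat bot at each in turn. This is a sketch using only the documented flags; the database directory names are arbitrary:

```shell
# Database with filename headers (the default behavior)
python process_codebase.py -c -db .rag_time/chroma_db_headers ./discourse

# Database without filename headers
python process_codebase.py -c -oh -db .rag_time/chroma_db_no_headers ./discourse

# Run the chat bot against one of them and compare the answers
export VECTOR_DB_PATH=".rag_time/chroma_db_no_headers" ; chainlit run main.py
```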