🧑‍💻 Codebase Chatbot with Retrieval-Augmented Generation (RAG)

Website URL: https://codebase-rag-sj.streamlit.app

An AI-powered chatbot that lets users explore and understand codebases through Retrieval-Augmented Generation (RAG). The tool embeds the contents of code repositories, stores the embeddings in a vector database, and uses Large Language Models (LLMs) to answer queries in context.

🚀 Features

Chat with a Codebase: Understand the structure, purpose, and potential improvements of any codebase.
Preloaded Repositories: Seamlessly switch between preloaded repositories to explore different projects.
Contextual Answers: Responses are generated by an LLM and grounded in code retrieved from the embedded repository.
Future Enhancements: Planned support for dynamically uploading any GitHub repository for embedding and querying.

🛠️ Tech Stack

Python: Core language for implementation.
Hugging Face Transformers: Used for generating embeddings with the sentence-transformers/all-mpnet-base-v2 model.
Pinecone: A vector database to store and retrieve code embeddings.
Streamlit: Frontend framework to provide an interactive and user-friendly UI.
OpenAI LLMs: For generating accurate, context-aware responses.

⚙️ How it Works

Create Vector Embeddings: A Hugging Face model creates embeddings of relevant information from the codebase, such as function definitions, comments, and documentation.
Store the Embeddings: The embeddings are stored in a Pinecone vector database.
Query the Codebase: When a query is made, the most relevant pieces of the codebase are retrieved from Pinecone and added to the query as context before it is sent to the LLM for a response.
Interactive Chat: Users can ask questions through a Streamlit-based UI, select from preloaded repositories, and receive responses in real time.
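
The retrieve-then-augment flow described above can be sketched in a few lines. This is a minimal, offline illustration and not the code in app.py: `toy_embed` is a bag-of-words stand-in for the Hugging Face embedding model, the in-memory `index` list stands in for Pinecone, and all function names and the prompt layout are hypothetical.

```python
import ast
import math
import re
from collections import Counter


def extract_function_chunks(source: str) -> list[str]:
    """Split Python source into one text chunk per function definition."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_source_segment returns the exact source text of the node.
            chunks.append(ast.get_source_segment(source, node))
    return chunks


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: word counts. A real setup would call the model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, index: list, top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Augment the query with retrieved code before sending it to an LLM."""
    context_block = "\n\n".join(contexts)
    return f"<CONTEXT>\n{context_block}\n</CONTEXT>\n\nQUESTION: {query}"


SAMPLE_SOURCE = '''
def load_config(path):
    """Read settings from a config file."""
    return open(path).read()

def send_email(recipient, body):
    """Send an email message to a recipient."""
    print(recipient, body)
'''

# Step 1 and 2: chunk the code and store (chunk, vector) pairs.
chunks = extract_function_chunks(SAMPLE_SOURCE)
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

# Step 3: retrieve the best match and build the augmented prompt.
question = "How do I send an email?"
prompt = build_prompt(question, retrieve(question, index, top_k=1))
print(prompt)
```

In a real deployment the count vectors would be replaced by dense embeddings and the sorted list by a vector database query, but the shape of the pipeline (chunk, embed, retrieve, augment) stays the same.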

📋 Setup Instructions

1. Clone the Repository

git clone https://github.com/Sruthij93/Codebase-RAG
cd Codebase-RAG

2. Install Dependencies

pip install -r requirements.txt

3. Set Up Pinecone

Create a Pinecone account at Pinecone.io.
Get your Pinecone API key and index name.
Configure your .env file with Pinecone credentials.
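
As a rough illustration of what the .env step involves, the sketch below parses KEY=VALUE lines and checks that the required settings are present. The variable names `PINECONE_API_KEY` and `PINECONE_INDEX_NAME` and the sample values are assumptions for illustration; use the names that app.py actually reads. Projects commonly load such files with the python-dotenv package instead of a hand-written parser.

```python
def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_settings(values: dict, names: list) -> dict:
    """Return the named settings, failing early if any is missing or empty."""
    missing = [name for name in names if not values.get(name)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return {name: values[name] for name in names}


# Hypothetical .env contents; the names and values are placeholders.
EXAMPLE_ENV = """
# Pinecone credentials
PINECONE_API_KEY="your-api-key"
PINECONE_INDEX_NAME=codebase-rag
"""

settings = require_settings(
    parse_env_text(EXAMPLE_ENV),
    ["PINECONE_API_KEY", "PINECONE_INDEX_NAME"],
)
print(sorted(settings))
```

Failing early on a missing key gives a clearer error than letting the Pinecone client fail later with an authentication problem.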

4. Run the Application

streamlit run app.py

🌟 Next Steps

Allow users to upload custom GitHub repositories for embedding.
Enhance the UI for better interactivity.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue for discussion.

