A Retrieval-Augmented Generation (RAG) system for querying and interacting with your document collection.
This project uses LangChain for document processing and ChromaDB for semantic search, with a Streamlit web interface for interactive Q&A.
- 📂 Load and query documents (PDF, text, etc.) from a folder.
- 🔎 Semantic search with vector embeddings for accurate retrieval.
- 💬 Chat-style interface for asking questions about your documents.
- ⚡ Supports both OpenAI API and local LLMs (GPT4All or HuggingFace models) — works offline with local models.
- 🖥 Easy to run locally or deploy on Streamlit Cloud.
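The retrieval step behind the features above can be illustrated without any dependencies. This is a toy sketch: a bag-of-words counter stands in for the real learned embeddings, and cosine similarity ranks chunks against the query, which is the same shape of computation ChromaDB performs over dense vectors.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; the real project uses a learned model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank document chunks by similarity to the query and return the top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Invoices are stored in the finance folder.",
    "The deployment guide covers Streamlit Cloud.",
    "Vector embeddings enable semantic search.",
]
print(retrieve("how do vector embeddings work", chunks, k=1))
# → ['Vector embeddings enable semantic search.']
```

In the actual pipeline, `embed` is replaced by an embedding model (OpenAI or a local HuggingFace model) and the ranking is delegated to the vector store.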
- Python 3.10+
- LangChain (`langchain`, `langchain-community`, `langchain-openai`)
- ChromaDB (vector store)
- Streamlit (UI)
- OpenAI API or local LLMs (GPT4All / HuggingFace models)
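The requirements above can be installed in one step (standard PyPI package names; pin versions as your environment requires):

```shell
pip install langchain langchain-community langchain-openai chromadb streamlit
```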
Ensure your network allows connections to the OpenAI API if you are using a cloud-hosted LLM.
Cache embeddings to speed up repeated runs.
Streamlit Cloud is a convenient deployment option if your local network blocks API calls.
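The embedding-cache tip can be sketched as a small on-disk store keyed by a hash of the chunk text, so the expensive embedding call runs once per chunk across reruns. `embed_fn`, `CACHE_FILE`, and the demo `fake_embed` are illustrative stand-ins, not part of the project:

```python
import hashlib
import json
from pathlib import Path

CACHE_FILE = Path("embedding_cache.json")  # hypothetical cache location

def cached_embed(text: str, embed_fn) -> list[float]:
    """Return the embedding for `text`, computing it only on a cache miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    if key not in cache:
        cache[key] = embed_fn(text)  # expensive model call happens once per chunk
        CACHE_FILE.write_text(json.dumps(cache))
    return cache[key]

# Demo with a stand-in embedder; swap in your real embedding model.
CACHE_FILE.unlink(missing_ok=True)  # start from an empty cache for the demo
calls = []
def fake_embed(text: str) -> list[float]:
    calls.append(text)
    return [float(len(text))]

v1 = cached_embed("hello world", fake_embed)
v2 = cached_embed("hello world", fake_embed)  # served from the cache
print(v1 == v2, len(calls))  # → True 1
```

At the library level, ChromaDB can also persist its index to disk, which serves a similar purpose for the vector store itself.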
MIT License