A beautiful, intelligent AI chatbot with Retrieval-Augmented Generation (RAG) capabilities. Upload your documents (PDFs, HTML, code files) and chat with an AI that understands your content, or use it as a general-purpose chatbot without any documents.
Dual Mode Operation
- 🟢 RAG Mode: Upload documents and get AI responses based on your content
- 🔵 General Mode: Chat with AI without any documents loaded
Multiple File Format Support
- 📄 PDF Documents
- 🌐 HTML Files (.html, .htm)
- 💻 Code Files (.py, .java, .js, .cpp, .c, .h, .cs, .rb, .go, .rs, .php, .swift, .kt, .ts)
- 📝 Text & Markdown (.txt, .md)
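The formats above amount to an extension allow-list. A minimal sketch of such a check (hypothetical: the actual validation in app.py may be organized differently):

```python
import os

# Extension allow-list matching the supported formats listed above.
ALLOWED_EXTENSIONS = {
    ".pdf", ".html", ".htm", ".txt", ".md",
    ".py", ".java", ".js", ".cpp", ".c", ".h", ".cs",
    ".rb", ".go", ".rs", ".php", ".swift", ".kt", ".ts",
}

def is_supported(filename: str) -> bool:
    """Return True if the file's extension is in the allow-list."""
    return os.path.splitext(filename)[1].lower() in ALLOWED_EXTENSIONS
```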
Beautiful Web Interface
- Modern, responsive design
- File upload with drag-and-drop support
- Real-time chat with typing indicators
- Source citations for RAG responses
Powerful Technology Stack
- 🤖 Ollama (SmolLM2) for AI responses
- 🔍 ChromaDB for vector storage
- 📊 Sentence Transformers for embeddings
- 🌐 Flask for web interface
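As a quick illustration of the embedding layer in this stack, the snippet below encodes text with the same Sentence Transformers model used for retrieval (all-MiniLM-L6-v2); the example sentences are illustrative:

```python
# Illustration only: encode text with the project's embedding model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vectors = model.encode(["What is a Python list?", "Explain Java interfaces"])
print(vectors.shape)  # (2, 384): all-MiniLM-L6-v2 produces 384-dim embeddings
```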
Prerequisites
- Python 3.8+ installed on your system
- Ollama installed and running (download from https://ollama.ai)
- Pull the SmolLM2 model:
ollama pull smollm2:135m   # or, for better quality (CLI only): ollama pull smollm2:1.7b
Installation
- Clone or download this repository
- Install dependencies:
pip install -r requirements.txt
- Start the application:
python app.py
- Open your browser: Navigate to http://localhost:5000
Usage
Upload Documents (Optional)
- Click the "Upload Files" button
- Select one or more files (PDF, HTML, code files, etc.)
- Wait for processing (creates vector embeddings; see the sketch below)
- RAG mode is now active! 🟢
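For orientation, here is a hypothetical sketch of what the upload route in app.py could look like. The endpoint name, form field, and re-ingestion step are assumptions, not the app's actual code:

```python
# Hypothetical upload handler sketch; names and flow are assumptions.
import os
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename

app = Flask(__name__)
DATA_PATH = "data/"  # matches the DATA_PATH configuration variable

@app.route("/upload", methods=["POST"])  # assumed endpoint name
def upload():
    saved = []
    for f in request.files.getlist("files"):  # assumed form field name
        name = secure_filename(f.filename)
        f.save(os.path.join(DATA_PATH, name))
        saved.append(name)
    # The real app would then rebuild embeddings (see ingest_documents.py).
    return jsonify({"uploaded": saved})
```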
Chat with the AI
- Type your question in the input box
- Press Enter to send
- Get responses based on your documents (RAG mode) or general AI knowledge
Manage Documents
- Clear Docs: Remove all uploaded documents and return to general mode
- Clear Chat: Clear the conversation history
Run the chatbot in terminal mode:
python chatbot.py

Commands:
- Type your questions naturally
- sources: View sources used in the last RAG response
- quit or exit: Exit the chatbot
To manually process documents from the data/ folder:
- Place your files in the data/ directory
- Run:
python ingest_documents.py
Project Structure
168_rag_chatbot/
├── app.py # Flask web application
├── chatbot.py # CLI chatbot interface
├── ingest_documents.py # Document processing script
├── requirements.txt # Python dependencies
├── data/ # Upload folder for documents
├── chroma_db/ # Vector database storage
├── templates/
│ ├── index.html # Main chat interface
│ └── error.html # Error page
└── static/
├── style.css # Styling
└── script.js # Frontend logic
Edit the configuration variables in app.py, chatbot.py, or ingest_documents.py:
# Model Configuration
OLLAMA_MODEL = "smollm2:135m" # Ollama model to use
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2" # Embeddings model
TOP_K_RESULTS = 4 # Number of context chunks to retrieve
# Paths
DATA_PATH = "data/" # Document upload folder
CHROMA_PATH = "chroma_db" # Vector database location- Learning Assistant: Upload course materials and get answers based on your lessons
- Code Documentation: Upload code files and ask questions about implementation
- Research Helper: Process research papers and get insights
- Knowledge Base: Create a searchable knowledge base from your documents
- General Chat: Use without documents for general AI assistance
Privacy
- All processing happens locally on your machine
- Documents are stored locally in the data/ folder
- No data is sent to external services (except Ollama API calls)
Troubleshooting

Ollama connection errors
- Make sure Ollama is running:
ollama serve
- Verify the model is downloaded:
ollama list
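If you prefer to check from Python, the snippet below queries Ollama's local REST API; the /api/tags endpoint lists installed models:

```python
# Check that the local Ollama server is up and a SmolLM2 model is installed.
import requests

try:
    tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
    models = [m["name"] for m in tags.get("models", [])]
    print("smollm2 installed:", any(n.startswith("smollm2") for n in models))
except requests.ConnectionError:
    print("Ollama is not reachable; start it with: ollama serve")
```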
Import errors
- Install dependencies:
pip install -r requirements.txt
Files not processing
- Check that files are in supported formats
- Ensure files are not corrupted
- Check console output for specific errors
Resetting the vector database
- Clear and recreate: Click "Clear Docs" in the web interface
- Or manually delete the chroma_db folder
To modify or extend the chatbot:
- Change AI Model: Edit OLLAMA_MODEL in the configuration
- Add File Types: Extend the loaders in ingest_documents.py (see the sketch after this list)
- Customize UI: Modify templates/index.html and static/style.css
- Adjust RAG Behavior: Change chunk size, overlap, or retrieval count
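As referenced above, adding a file type mostly means mapping a new extension to a loader. A minimal sketch, assuming a loader map along these lines is added to ingest_documents.py (the script's actual structure may differ, and LangChain import paths vary by version):

```python
# Sketch: map file extensions to LangChain document loaders.
import os
from langchain_community.document_loaders import PyPDFLoader, TextLoader

LOADER_MAP = {
    ".pdf": PyPDFLoader,
    ".txt": TextLoader,
    ".md": TextLoader,
    ".yaml": TextLoader,  # example: a newly supported extension
}

def load_file(path: str):
    """Pick a loader by extension and return the parsed documents."""
    ext = os.path.splitext(path)[1].lower()
    loader_cls = LOADER_MAP.get(ext)
    if loader_cls is None:
        raise ValueError(f"Unsupported file type: {ext}")
    return loader_cls(path).load()
```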
Contributions are welcome! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
This project is open source and available under the MIT License.
- Ollama - Local AI model hosting
- LangChain - Document processing and RAG framework
- ChromaDB - Vector database
- Sentence Transformers - Text embeddings
Made with ❤️ using Python, Flask, and AI

RAG Chatbot for Coding Questions
A Retrieval-Augmented Generation (RAG) chatbot that answers coding questions based on your PDF learning materials using the SmolLM2:1.7b model from Ollama.
- 📚 PDF Processing: Automatically processes PDF documents from the data/ folder
- 🔍 Semantic Search: Uses vector embeddings to find relevant content
- 🤖 Local LLM: Powered by SmolLM2:1.7b via Ollama (runs locally)
- 💬 Interactive Chat: Simple command-line interface for asking questions
- 🎯 Context-Aware: Answers based on your specific learning materials
- Python 3.8+ installed
- Ollama installed with SmolLM2:1.7b model
- Install Ollama from: https://ollama.ai
- Pull the model:
ollama pull smollm2:1.7b
# Activate virtual environment
.\master\Scripts\activate
# Install required packages
pip install -r requirements.txt

Place your PDF files in the data/ folder. Currently includes:
- learning_java.pdf
- Learning_Python.pdf
Process the PDFs and create the vector database:
python ingest_documents.py

This will:
- Load all PDF files from the data/ folder
- Split them into manageable chunks
- Create embeddings using sentence-transformers
- Store them in a ChromaDB vector database
python chatbot.py- "How do I create a list in Python?"
- "What is inheritance in Java?"
- "Explain Python decorators"
- "How do I handle exceptions in Java?"
- "What are Python list comprehensions?"
- "Explain Java interfaces"
- Type your question and press Enter
- Type sources to see the source documents for the last answer
- Type quit or exit to end the conversation
How It Works

Document Ingestion (ingest_documents.py), sketched in code after this list:
- Loads PDF files
- Splits into chunks (1000 chars with 200 overlap)
- Creates embeddings using all-MiniLM-L6-v2
- Stores in ChromaDB
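A condensed sketch of those four steps, using common LangChain components. Exact import paths vary by LangChain version, and ingest_documents.py may organize this differently:

```python
# Sketch of the ingestion pipeline described above.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

docs = PyPDFLoader("data/learning_java.pdf").load()          # 1. load PDFs

splitter = RecursiveCharacterTextSplitter(chunk_size=1000,   # 2. split into
                                          chunk_overlap=200) #    chunks
chunks = splitter.split_documents(docs)

embeddings = HuggingFaceEmbeddings(                          # 3. embed
    model_name="sentence-transformers/all-MiniLM-L6-v2")

Chroma.from_documents(chunks, embeddings,                    # 4. store
                      persist_directory="chroma_db")
```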
RAG Pipeline (chatbot.py), sketched in code after this list:
- Takes user question
- Finds relevant chunks using semantic search
- Creates context from top 4 results
- Sends context + question to SmolLM2:1.7b
- Returns contextual answer
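And a matching sketch of the query path, assuming the vector store created above and a locally running Ollama server; the prompt wording is illustrative, not the script's actual prompt:

```python
# Sketch of the RAG query path described above.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
import ollama  # official Ollama Python client

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2")
db = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

question = "What is inheritance in Java?"
hits = db.similarity_search(question, k=4)  # top 4 chunks, as described
context = "\n\n".join(doc.page_content for doc in hits)

response = ollama.chat(
    model="smollm2:1.7b",
    messages=[{"role": "user",
               "content": f"Use this context to answer:\n{context}\n\n"
                          f"Question: {question}"}],
)
print(response["message"]["content"])
```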
Project Structure
rag_chatbot/
├── data/ # PDF documents
│ ├── learning_java.pdf
│ └── Learning_Python.pdf
├── chroma_db/ # Vector database (created after ingestion)
├── master/ # Virtual environment
├── ingest_documents.py # Document processing script
├── chatbot.py # Main chatbot application
├── requirements.txt # Python dependencies
└── README.md # This file
Troubleshooting
- Vector database missing: Run python ingest_documents.py first to create the vector database.
- Make sure Ollama is running
- Verify the model is installed:
ollama list
- Pull the model if needed:
ollama pull smollm2:1.7b
If responses are slow:
- Reduce TOP_K_RESULTS in chatbot.py (default: 4)
- Reduce chunk_size in ingest_documents.py (default: 1000)
Edit OLLAMA_MODEL in chatbot.py:
OLLAMA_MODEL = "your-model-name"

Edit chunk_size in ingest_documents.py:
chunk_size=1000,   # Increase or decrease
chunk_overlap=200  # Adjust overlap

Edit TOP_K_RESULTS in chatbot.py:
TOP_K_RESULTS = 4  # Increase for more context

Adding more learning materials:
- Add PDF, text, or code files to the data/ folder
- Re-run the ingestion:
python ingest_documents.py
- The chatbot will now include the new materials
MIT License - Feel free to use and modify as needed!