A lightweight Retrieval-Augmented Generation (RAG) system that lets you chat with your own data using local AI models. This implementation uses simple keyword matching for retrieval and connects to Ollama for text generation.
- Load your data: Reads text files and makes them searchable
- Find relevant information: Uses keyword matching to find the most relevant content for your questions
- Generate answers: Connects to a local AI model (via Ollama) to generate natural language responses based on the retrieved information
Make sure you have Python 3.7+ installed:
- Windows: Download from python.org (check "Add Python to PATH" during installation)
- Mac: `brew install python3` or download from python.org
- Linux: `sudo apt install python3 python3-pip`
Install Ollama for local AI model serving:
- Go to ollama.ai
- Download and install for your operating system
- Pull a model (we recommend llama3.2):

```bash
ollama pull llama3.2
```
```bash
git clone <your-repo-url>
cd <your-repo-name>
uv sync
```

Note: This project only requires the `requests` library; everything else uses Python's built-in modules.
```bash
python simple_rag.py
```

Once running, you can ask questions like:
- "Who is the CEO of Apple?"
- "What does NVIDIA make?"
- "Which companies are worth over 1 trillion dollars?"
Type `quit` to exit.
Your Question → Keyword Search → Find Relevant Data → Send to Ollama → Get Answer
- Data Storage: SQLite database stores text chunks and keywords
- Retrieval: Simple keyword matching finds relevant information
- Generation: Ollama processes retrieved context and generates responses
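To make the storage step concrete, here's a minimal sketch of what indexing could look like (the function name, the `rag.db` filename, and the blank-line splitting rule are illustrative assumptions; the actual logic lives in `simple_rag.py`):

```python
import re
import sqlite3

def index_file(path, db_path="rag.db"):
    """Hypothetical sketch of the indexing step: split a text file
    into chunks at blank lines and store each chunk with its
    extracted keywords in SQLite."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(id INTEGER PRIMARY KEY, content TEXT, keywords TEXT)"
    )
    with open(path, encoding="utf-8") as f:
        text = f.read()
    # Natural breaks: blank lines separate logical sections
    for chunk in re.split(r"\n\s*\n", text):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Keywords: lowercase alphanumeric words, deduplicated
        words = sorted(set(re.findall(r"[a-z0-9]+", chunk.lower())))
        conn.execute(
            "INSERT INTO chunks (content, keywords) VALUES (?, ?)",
            (chunk, " ".join(words)),
        )
    conn.commit()
    conn.close()
```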
- You ask: "Who runs Microsoft?"
- System finds chunks containing "microsoft" and "ceo"
- Sends Microsoft's information + your question to Ollama
- Ollama responds: "Satya Nadella is the CEO of Microsoft"
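The generation step boils down to a single HTTP POST to Ollama's local API. Here's a minimal sketch (the prompt wording and function name are illustrative assumptions; the JSON payload matches the one shown under the configuration notes below):

```python
import requests

def ask_ollama(question, context):
    # Illustrative prompt template; adjust to taste
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    response = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default local endpoint
        json={"model": "llama3.2", "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]  # the generated answer text
```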
- Replace `tech_companies.txt` with your own text file
- Format your data with clear sections (the system will auto-detect natural breaks)
- Run the system - it will automatically re-index your new data
- Use clear headings: Company names, product names, etc.
- Include keywords: Make sure important terms appear in your text
- Natural sections: The system splits text into logical chunks automatically
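For example, a data file might look like this (illustrative content; each blank-line-separated block becomes one searchable chunk):

```text
Apple
Apple Inc. is a consumer electronics company headquartered in
Cupertino, California. Tim Cook is the CEO.

NVIDIA
NVIDIA designs GPUs for gaming and AI data centers. Jensen Huang
is the CEO.
```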
Edit the model name in `simple_rag.py`:

```python
# In the chat() method, change:
json={"model": "llama3.2", "prompt": prompt, "stream": False}

# To your preferred model:
json={"model": "llama2", "prompt": prompt, "stream": False}
```

Change the number of relevant chunks returned:
```python
# In the search() method, modify n_results:
def search(self, query, n_results=3):  # Change 3 to your preferred number
```

- Make sure Ollama is running: `ollama serve`
- Verify the model is installed: `ollama list`
- Check that the model name matches what you're using in the code
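You can also confirm the server is reachable from Python (a quick check assuming Ollama's default port, 11434):

```python
import requests

# Lists locally installed models if the Ollama server is up
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
print(resp.json())  # expect {"models": [...]}
```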
- Make sure your data file is in the same directory as `simple_rag.py`
- Check that the filename matches exactly (case-sensitive)
- On Windows: Make sure Python is added to PATH during installation
- Try `python3` instead of `python`
- Verify installation: `python --version`
- Make sure your question contains keywords that appear in your data
- Try rephrasing your question with different terms
- Check that your data file contains the information you're asking about
- Local data processing (no external APIs needed for search)
- SQLite database for fast keyword lookup
- Integration with any Ollama model
- Simple, readable codebase
- No complex dependencies
- Built-in: `os`, `re`, `json`, `sqlite3` (no installation needed)
- External: `requests` (for Ollama communication)
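If you're not using `uv`, the single external dependency can be installed directly:

```bash
pip install requests
```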
```sql
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    content TEXT,    -- The actual text content
    keywords TEXT    -- Searchable keywords extracted from content
);
```

- Indexing: Fast keyword extraction and SQLite storage
- Search: O(n) keyword matching across all chunks
- Memory: Minimal - only active chunks loaded into memory
- Storage: SQLite database file created in project directory
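As a rough illustration of that O(n) search (the function name, scoring, and `rag.db` filename are assumptions; see the `search()` method in `simple_rag.py` for the real implementation):

```python
import re
import sqlite3

def search(query, n_results=3, db_path="rag.db"):
    """Hypothetical sketch: score every chunk by how many query
    keywords appear in its stored keyword list, return top matches."""
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    conn = sqlite3.connect(db_path)
    scored = []
    for content, keywords in conn.execute("SELECT content, keywords FROM chunks"):
        overlap = len(query_words & set(keywords.split()))
        if overlap:
            scored.append((overlap, content))
    conn.close()
    scored.sort(reverse=True)  # highest keyword overlap first
    return [content for _, content in scored[:n_results]]
```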
Feel free to submit issues and enhancement requests! This is a learning project, so improvements and educational additions are especially welcome.
This project is open source and available under the MIT License.
Happy chatting with your data! 🤖💬