Production-grade Multi-Retrieval RAG System with Explainable Results
A modern Retrieval-Augmented Generation (RAG) system demonstrating best practices for 2026. Built with semantic search, keyword search, hybrid ranking, comprehensive evaluation metrics, and full explainability.
Most RAG systems are either:
- Too simple (single retrieval strategy, no evaluation)
- Too complex (production systems with business logic mixed in)
PolyRAG is the Goldilocks solution: production-quality architecture with educational clarity.
β¨ Multi-Retrieval: Semantic (vector) + Keyword (BM25) + Hybrid (RRF + weighted fusion)
π Explainable: Shows which retriever found each chunk, with scores and ranking reasons
π Evaluation First: Built-in metrics (Recall@K, Precision@K, MRR, MAP, F1)
ποΈ Production Ready: Type-safe (Pydantic v2), swappable components, config-driven
π¨ Beautiful UI: Modern Next.js frontend with real-time metrics and charts
π¦ Docker Ready: Full docker-compose setup with Postgres
PolyRAG/
βββ backend/ # Python FastAPI Backend
β βββ api/ # API endpoints (ingestion, query, eval, system)
β βββ retrieval/ # Retrieval strategies
β β βββ semantic.py # Vector search (FAISS/ChromaDB)
β β βββ keyword.py # BM25 implementation
β β βββ hybrid.py # RRF + weighted fusion
β βββ ranking/ # Score fusion and reranking
β βββ evaluation/ # RAG metrics implementation
β βββ ingestion/ # Document chunking & processing
β βββ embeddings/ # OpenAI + local embeddings
β βββ vectorstore/ # Vector DB abstraction
β βββ models/ # Pydantic v2 schemas
β βββ db/ # SQLAlchemy models
β βββ config.py # Configuration management
βββ frontend/ # Next.js 16 Frontend
β βββ app/
β β βββ page.tsx # Landing page
β β βββ ingest/ # Document ingestion UI
β β βββ query/ # Query interface with explainability
β β βββ evaluation/ # Metrics dashboard with charts
β βββ components/ui/ # shadcn/ui components
β βββ lib/api.ts # API client
βββ docker-compose.yml # Full stack deployment
# Clone the repository
git clone https://github.com/yourusername/polyRAG.git
cd polyRAG
# Set up environment variables
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (optional - can use local embeddings)
# Start all services
docker-compose up -d
# Backend: http://localhost:8000
# Frontend: http://localhost:3000
# API Docs: http://localhost:8000/docs# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Edit .env with your configuration
# Initialize database
# The app will auto-create tables on first run
# Run the API
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000cd frontend
# Install dependencies
npm install
# Set environment variable
echo "NEXT_PUBLIC_API_URL=http://localhost:8000/api" > .env.local
# Run development server
npm run devAccess the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
# Via API
curl -X POST "http://localhost:8000/api/ingest/" \
-H "Content-Type: application/json" \
-d '{
"title": "Introduction to RAG",
"content": "Your document content here...",
"source": "manual_upload"
}'
# Or use the frontend at http://localhost:3000/ingestimport requests
response = requests.post("http://localhost:8000/api/query/", json={
"query": "How does hybrid retrieval work?",
"retrieval_mode": "hybrid",
"top_k": 5
})
results = response.json()
for chunk in results["results"]:
print(f"Score: {chunk['final_score']}")
print(f"Content: {chunk['content']}")
print(f"Sources: {chunk['sources']}")# Prepare evaluation dataset
eval_dataset = [
{
"query": "What is RAG?",
"relevant_chunks": ["chunk_id_1", "chunk_id_3"]
}
]
# Run evaluation
response = requests.post("http://localhost:8000/api/eval/run", json={
"evaluation_dataset": eval_dataset,
"retrieval_mode": "hybrid",
"top_k": 5
})
metrics = response.json()["metrics"]
print(f"Recall@5: {metrics['recall_at_k']}")
print(f"Precision@5: {metrics['precision_at_k']}")
print(f"MRR: {metrics['mrr']}")- Uses transformer-based embeddings (e.g.,
all-MiniLM-L6-v2) - Converts text to dense vectors (384 or 768 dimensions)
- Finds similar documents via cosine similarity
- Best for: Conceptual similarity, paraphrased queries
- Classic information retrieval algorithm
- TF-IDF based with length normalization
- Inverted index for efficient lookup
- Best for: Exact term matches, specific phrases
- Reciprocal Rank Fusion (RRF): Combines rankings from multiple retrievers
- Weighted Score Fusion: Configurable weights for semantic vs keyword
- Deduplication: Merges overlapping results
- Best for: Most robust retrieval across diverse queries
| Metric | Description | Formula |
|---|---|---|
| Recall@K | % of relevant docs in top-K results | relevant_in_K / total_relevant |
| Precision@K | % of top-K results that are relevant | relevant_in_K / K |
| MRR | Mean Reciprocal Rank of first relevant result | avg(1 / rank_first_relevant) |
| MAP | Mean Average Precision across all queries | avg(AP_per_query) |
| F1 Score | Harmonic mean of precision and recall | 2 * (P * R) / (P + R) |
- System statistics (documents, chunks, embeddings)
- Quick access to all features
- Beautiful gradient UI with dark mode support
- Upload documents with automatic chunking
- Real-time processing status
- View all ingested documents
- Sample data loader for testing
- Select retrieval strategy (semantic/keyword/hybrid)
- Configurable top-K
- Explainability: See which retriever found each result
- Score breakdown and ranking explanation
- Sample queries for quick testing
- Run evaluations with custom datasets
- Visualization with recharts:
- Bar charts for metric comparison
- Radar charts for multi-dimensional view
- Per-query detailed results
- Evaluation history tracking
- Strategy comparison
# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/polyrag
# Embeddings
EMBEDDING_PROVIDER=local # or 'openai'
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
OPENAI_API_KEY=sk-... # Only needed if using OpenAI
# Vector Store
VECTOR_STORE_TYPE=faiss # or 'chromadb'
# Retrieval
DEFAULT_TOP_K=5
SEMANTIC_WEIGHT=0.7
KEYWORD_WEIGHT=0.3
# API
API_HOST=0.0.0.0
API_PORT=8000
LOG_LEVEL=INFOEmbedding Models:
# backend/config.py
settings.embedding_provider = "openai" # or "local"
settings.embedding_model = "text-embedding-3-small"Vector Stores:
# backend/config.py
settings.vector_store_type = "chromadb" # or "faiss"# Backend tests
pytest backend/tests/
# Frontend tests
cd frontend && npm test
# End-to-end tests
pytest e2e/- Ingestion: ~500 chars/chunk, ~10 chunks/sec
- Embedding: Batched processing (32 chunks/batch)
- Retrieval: <100ms for top-10 results (semantic + keyword + fusion)
- Vector Search: <50ms on 10K documents (FAISS)
- Context Graph (parent-child relationships)
- Advanced reranking (cross-encoder models)
- Multi-modal support (images, tables)
- Query expansion and reformulation
- Streaming responses
- A/B testing framework
- Production deployment guides (K8s, AWS)
This is a portfolio/educational project, but contributions are welcome!
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'feat: add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
This project was built commit-by-commit to demonstrate clear development progression:
- Project scaffolding
- Database models and schemas
- Document ingestion pipeline
- Embeddings service
- Semantic retriever
- Keyword retriever (BM25)
- Hybrid ranking
- FastAPI endpoints
- Evaluation engine
- Example usage
- Frontend UI (complete)
- Docker setup and documentation
See git log for full commit history with detailed messages.
MIT License - See LICENSE file for details
Built as a portfolio project demonstrating production-grade RAG system architecture.
β If this project helped you understand RAG systems better, consider starring it!