# PolyRAG: Production-Grade Multi-Retrieval RAG System with Explainable Results

A modern Retrieval-Augmented Generation (RAG) system demonstrating best practices for 2026. Built with semantic search, keyword search, hybrid ranking, comprehensive evaluation metrics, and full explainability.
## Why PolyRAG?

Most RAG systems are either:

- Too simple (a single retrieval strategy, no evaluation)
- Too complex (production systems with business logic mixed in)

PolyRAG is the Goldilocks solution: production-quality architecture with educational clarity.
## Features

- **Multi-Retrieval**: Semantic (vector) + keyword (BM25) + hybrid (RRF + weighted fusion)
- **Explainable**: Shows which retriever found each chunk, with scores and ranking reasons
- **Evaluation First**: Built-in metrics (Recall@K, Precision@K, MRR, MAP, F1)
- **Production Ready**: Type-safe (Pydantic v2), swappable components, config-driven
- **Beautiful UI**: Modern Next.js frontend with real-time metrics and charts
- **Docker Ready**: Full docker-compose setup with Postgres
## Project Structure

```
PolyRAG/
├── backend/                 # Python FastAPI backend
│   ├── api/                 # API endpoints (ingestion, query, eval, system)
│   ├── retrieval/           # Retrieval strategies
│   │   ├── semantic.py      # Vector search (FAISS/ChromaDB)
│   │   ├── keyword.py       # BM25 implementation
│   │   └── hybrid.py        # RRF + weighted fusion
│   ├── ranking/             # Score fusion and reranking
│   ├── evaluation/          # RAG metrics implementation
│   ├── ingestion/           # Document chunking & processing
│   ├── embeddings/          # OpenAI + local embeddings
│   ├── vectorstore/         # Vector DB abstraction
│   ├── models/              # Pydantic v2 schemas
│   ├── db/                  # SQLAlchemy models
│   └── config.py            # Configuration management
├── frontend/                # Next.js 16 frontend
│   ├── app/
│   │   ├── page.tsx         # Landing page
│   │   ├── ingest/          # Document ingestion UI
│   │   ├── query/           # Query interface with explainability
│   │   └── evaluation/     # Metrics dashboard with charts
│   ├── components/ui/       # shadcn/ui components
│   └── lib/api.ts           # API client
└── docker-compose.yml       # Full-stack deployment
```
## Quick Start

### Option 1: Docker (full stack)

```bash
# Clone the repository
git clone https://github.com/yourusername/polyRAG.git
cd polyRAG

# Set up environment variables
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (optional: local embeddings work without it)

# Start all services
docker-compose up -d

# Backend:  http://localhost:8000
# Frontend: http://localhost:3000
# API docs: http://localhost:8000/docs
```

### Option 2: Manual setup

**Backend:**

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Edit .env with your configuration
# Initialize database
# The app will auto-create tables on first run
# Run the API
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
```

**Frontend:**

```bash
cd frontend
# Install dependencies
npm install
# Set environment variable
echo "NEXT_PUBLIC_API_URL=http://localhost:8000/api" > .env.local
# Run development server
npm run dev
```

Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
## Usage

### 1. Ingest a document

```bash
# Via the API
curl -X POST "http://localhost:8000/api/ingest/" \
  -H "Content-Type: application/json" \
  -d '{
        "title": "Introduction to RAG",
        "content": "Your document content here...",
        "source": "manual_upload"
      }'
```

Or use the frontend at http://localhost:3000/ingest.

### 2. Query with explainability

```python
import requests
response = requests.post("http://localhost:8000/api/query/", json={
"query": "How does hybrid retrieval work?",
"retrieval_mode": "hybrid",
"top_k": 5
})
results = response.json()
for chunk in results["results"]:
    print(f"Score: {chunk['final_score']}")
    print(f"Content: {chunk['content']}")
    print(f"Sources: {chunk['sources']}")
```

### 3. Run an evaluation

```python
# Prepare an evaluation dataset
eval_dataset = [
    {
        "query": "What is RAG?",
        "relevant_chunks": ["chunk_id_1", "chunk_id_3"]
    }
]
# Run evaluation
response = requests.post("http://localhost:8000/api/eval/run", json={
"evaluation_dataset": eval_dataset,
"retrieval_mode": "hybrid",
"top_k": 5
})
metrics = response.json()["metrics"]
print(f"Recall@5: {metrics['recall_at_k']}")
print(f"Precision@5: {metrics['precision_at_k']}")
print(f"MRR: {metrics['mrr']}")- Uses transformer-based embeddings (e.g.,
all-MiniLM-L6-v2) - Converts text to dense vectors (384 or 768 dimensions)
- Finds similar documents via cosine similarity
- Best for: Conceptual similarity, paraphrased queries
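
For intuition, here is a minimal, self-contained sketch of dense retrieval using the `sentence-transformers` package (assumed here as the stand-in; the project's actual wiring lives in `backend/retrieval/semantic.py`):

```python
# Illustrative sketch, not the project's code: rank documents by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # 384-dim embeddings

docs = [
    "RAG augments generation with retrieved context.",
    "BM25 ranks documents by term frequency and document length.",
]
query_emb = model.encode("How does retrieval-augmented generation work?")
doc_embs = model.encode(docs)

# Cosine similarity between the query and every document (tensor of shape 1 x len(docs))
scores = util.cos_sim(query_emb, doc_embs)[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {doc}")
```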
### Keyword Search (BM25)

- Classic information-retrieval algorithm
- TF-IDF based, with document-length normalization
- Inverted index for efficient lookup
- Best for: Exact term matches, specific phrases
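
For comparison, a tiny keyword example using the `rank-bm25` package (an assumed stand-in; the repository's BM25 lives in `backend/retrieval/keyword.py`):

```python
# Illustrative sketch, not the project's code: score documents with BM25.
from rank_bm25 import BM25Okapi

corpus = [
    "hybrid retrieval fuses semantic and keyword rankings",
    "BM25 rewards exact term matches, normalized by document length",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

query_tokens = "exact term matches".lower().split()
print(bm25.get_scores(query_tokens))  # one relevance score per document
```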
### Hybrid Retrieval

- Reciprocal Rank Fusion (RRF): Combines rankings from multiple retrievers
- Weighted score fusion: Configurable weights for semantic vs. keyword
- Deduplication: Merges overlapping results
- Best for: The most robust retrieval across diverse queries (see the sketch below)
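
Both fusion modes are easy to sketch. RRF scores each document as the sum over retrievers of `1 / (k + rank)`, conventionally with k = 60; weighted fusion mixes normalized per-retriever scores using the `SEMANTIC_WEIGHT` / `KEYWORD_WEIGHT` settings shown under Configuration below. Names here are illustrative, not the project's API:

```python
# Illustrative sketch, not the project's code: fuse two ranked lists of chunk IDs.
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal Rank Fusion: score(d) = sum over retrievers of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return dict(scores)

def weighted_fuse(semantic: dict[str, float], keyword: dict[str, float],
                  w_sem: float = 0.7, w_kw: float = 0.3) -> dict[str, float]:
    """Weighted fusion over (already normalized) per-retriever scores."""
    chunk_ids = set(semantic) | set(keyword)  # the union also deduplicates overlaps
    return {c: w_sem * semantic.get(c, 0.0) + w_kw * keyword.get(c, 0.0) for c in chunk_ids}

semantic_ranking = ["c3", "c1", "c2"]
keyword_ranking = ["c1", "c4", "c3"]
fused = rrf_fuse([semantic_ranking, keyword_ranking])
print(sorted(fused.items(), key=lambda pair: -pair[1]))  # c1 and c3 rise to the top
```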
## Evaluation Metrics

| Metric | Description | Formula |
|---|---|---|
| Recall@K | Share of all relevant docs that appear in the top-K results | `relevant_in_K / total_relevant` |
| Precision@K | Share of the top-K results that are relevant | `relevant_in_K / K` |
| MRR | Mean Reciprocal Rank of the first relevant result | `avg(1 / rank_first_relevant)` |
| MAP | Mean Average Precision across all queries | `avg(AP_per_query)` |
| F1 Score | Harmonic mean of precision and recall | `2 * (P * R) / (P + R)` |
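
For reference, here is one plausible implementation of three of these metrics (hypothetical helper names; the project's versions live in `backend/evaluation/`):

```python
# Illustrative sketch, not the project's code: retrieval metrics over ranked chunk IDs.
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant chunks that appear in the top-k results."""
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    hits = len(set(retrieved[:k]) & relevant)
    return hits / k if k else 0.0

def mrr(all_retrieved: list[list[str]], all_relevant: list[set[str]]) -> float:
    """Mean of 1/rank of the first relevant result per query (0 if none is found)."""
    total = 0.0
    for retrieved, relevant in zip(all_retrieved, all_relevant):
        for rank, chunk_id in enumerate(retrieved, start=1):
            if chunk_id in relevant:
                total += 1.0 / rank
                break
    return total / len(all_retrieved)

print(recall_at_k(["c1", "c2", "c5"], {"c1", "c3"}, k=3))     # 0.5
print(precision_at_k(["c1", "c2", "c5"], {"c1", "c3"}, k=3))  # 0.333...
print(mrr([["c2", "c1"]], [{"c1"}]))                          # 0.5
```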
## Frontend Pages

### Landing Page

- System statistics (documents, chunks, embeddings)
- Quick access to all features
- Beautiful gradient UI with dark-mode support
### Ingestion (`/ingest`)

- Upload documents with automatic chunking
- Real-time processing status
- View all ingested documents
- Sample data loader for testing
### Query (`/query`)

- Select retrieval strategy (semantic/keyword/hybrid)
- Configurable top-K
- Explainability: See which retriever found each result
- Score breakdown and ranking explanation
- Sample queries for quick testing
### Evaluation (`/evaluation`)

- Run evaluations with custom datasets
- Visualization with recharts:
  - Bar charts for metric comparison
  - Radar charts for a multi-dimensional view
- Per-query detailed results
- Evaluation history tracking
- Strategy comparison
## Configuration

Environment variables (`.env`):

```bash
# Database
DATABASE_URL=postgresql://user:pass@localhost:5432/polyrag
# Embeddings
EMBEDDING_PROVIDER=local # or 'openai'
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
OPENAI_API_KEY=sk-... # Only needed if using OpenAI
# Vector Store
VECTOR_STORE_TYPE=faiss # or 'chromadb'
# Retrieval
DEFAULT_TOP_K=5
SEMANTIC_WEIGHT=0.7
KEYWORD_WEIGHT=0.3
# API
API_HOST=0.0.0.0
API_PORT=8000
LOG_LEVEL=INFO
```

### Swapping components

**Embedding models:**

```python
# backend/config.py
settings.embedding_provider = "openai" # or "local"
settings.embedding_model = "text-embedding-3-small"
```

**Vector stores:**

```python
# backend/config.py
settings.vector_store_type = "chromadb"  # or "faiss"
```

## Testing

```bash
# Backend tests
pytest backend/tests/
# Frontend tests
cd frontend && npm test
# End-to-end tests
pytest e2e/
```

## Performance

- Ingestion: ~500 chars/chunk, ~10 chunks/sec
- Embedding: Batched processing (32 chunks/batch)
- Retrieval: <100ms for top-10 results (semantic + keyword + fusion)
- Vector Search: <50ms on 10K documents (FAISS)
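
To make the ~500 chars/chunk figure concrete, a fixed-size chunker with overlap can be sketched in a few lines (illustrative only; the project's chunking logic lives in `backend/ingestion/`):

```python
# Illustrative sketch, not the project's code: fixed-size character chunking with overlap.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into ~chunk_size-character pieces; overlap preserves context at boundaries."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "Your document content here... " * 100
print(len(chunk_text(doc)), "chunks")
```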
## Roadmap

- Context Graph (parent-child relationships)
- Advanced reranking (cross-encoder models)
- Multi-modal support (images, tables)
- Query expansion and reformulation
- Streaming responses
- A/B testing framework
- Production deployment guides (K8s, AWS)
## Contributing

This is a portfolio/educational project, but contributions are welcome!

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'feat: add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Development History

This project was built commit-by-commit to demonstrate clear development progression:
- Project scaffolding
- Database models and schemas
- Document ingestion pipeline
- Embeddings service
- Semantic retriever
- Keyword retriever (BM25)
- Hybrid ranking
- FastAPI endpoints
- Evaluation engine
- Example usage
- Frontend UI (complete)
- Docker setup and documentation
See `git log` for the full commit history with detailed messages.
## License

MIT License. See the LICENSE file for details.
Built as a portfolio project demonstrating production-grade RAG system architecture.
⭐ If this project helped you understand RAG systems better, consider starring it!