A Retrieval-Augmented Generation (RAG) system that answers natural language questions about uploaded PDF documents, strictly grounded in document content. No hallucinations.
Built with LlamaIndex, LlamaParse, Groq, and Neon PostgreSQL (pgvector).
PDF Upload → LlamaParse (Cloud Markdown extraction)
→ Metadata tagging (document_name injected per chunk)
→ SentenceSplitter (512 tokens, 64 overlap)
→ HuggingFace BGE-small (384-dim, LOCAL — no API calls) → Neon pgvector DB
─────────────────────────────────────────────────────────────────────────────────────
User Question → HuggingFace BGE-small (LOCAL embed) → Cosine Search
→ Top-5 Chunks → Strict Grounded Prompt
→ Groq (temp=0.1) → Answer + Page Citations
See ARCHITECTURE.md for a detailed component diagram.
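The query-side retrieval step (cosine search over stored embeddings, keep the top 5) can be sketched in plain Python. This is an illustration only: in the running system, pgvector performs this search inside Neon, and the helper names below are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Return the text of the k chunks most similar to the query embedding.

    chunks: list of (chunk_text, embedding) pairs.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In practice the 384-dim BGE-small query embedding is compared against every stored chunk embedding the same way, just inside the database rather than in Python.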
- Python 3.11+
- A Neon.tech PostgreSQL database (free tier works)
- A LlamaCloud API key (for LlamaParse)
- A Groq API key (for the LLM only)
Note: No embedding API key needed — embeddings run locally via HuggingFace. The model (BAAI/bge-small-en-v1.5, ~133 MB) is downloaded automatically on first run.
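As a sketch, the centralized settings in app/llm/__init__.py might look like the following (the Groq model name is an assumption, not specified in this README):

```python
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.groq import Groq

# Local embedding model: downloaded once (~133 MB), no API key required.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
# Groq LLM at low temperature for grounded answers; model name is an assumption.
Settings.llm = Groq(model="llama-3.1-8b-instant", temperature=0.1)
# Chunking as described above: 512-token chunks with 64-token overlap.
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=64)
```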
```bash
cd ai-doc-rag
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Fill in GROQ_API_KEY, NEON_DATABASE_URL, LLAMA_CLOUD_API_KEY
```

| Variable | Source |
|---|---|
| GROQ_API_KEY | Groq Console — LLM only |
| NEON_DATABASE_URL | Neon.tech — free PostgreSQL + pgvector |
| LLAMA_CLOUD_API_KEY | LlamaCloud — for LlamaParse |
```bash
streamlit run app/main.py
```

- Upload any PDF document in the sidebar (e.g., Annual-Report-FY-2023-24.pdf)
- Click "Process Document" — LlamaParse extracts tables + text as Markdown
- Ask questions — e.g. "What was the total revenue in FY2024?"
- Upload additional documents — they are added to the same index with metadata tagging
All documents share the same pgvector table (document_chunks_llama). Each chunk
is tagged with a document_name in its metadata during ingestion, so:
- Source chunks display which document they came from
- The LLM is instructed to mention the source document when relevant
- Duplicate uploads (same file hash) are rejected automatically
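The duplicate-rejection step above can be sketched as a file-hash check; `UploadRegistry` is a hypothetical name for illustration, not the actual class in the ingestion pipeline:

```python
import hashlib

def file_sha256(data: bytes) -> str:
    """Content hash of an uploaded file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

class UploadRegistry:
    """Tracks hashes of already-ingested files so the same PDF is not indexed twice."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def register(self, data: bytes) -> bool:
        """Return True if the file is new; False if its hash was seen before."""
        digest = file_sha256(data)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

Hashing the raw bytes means a re-upload of the identical file is rejected even if the filename changes.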
```bash
docker build -t doc-rag .
docker run -p 8501:8501 --env-file .env doc-rag
```

```bash
pytest tests/ -v
```

All tests run fully offline (external APIs mocked).
| Technique | Setting |
|---|---|
| Low temperature | 0.1 |
| Strict system prompt | Answer ONLY from context |
| Refusal instruction | Say "not available" if absent |
| Citation enforcement | Always cite page numbers |
| Table fidelity | LlamaParse preserves Markdown tables |
| Limited context | Top-5 chunks only |
| Document source | Prompt mentions document_name metadata |
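A minimal sketch of what a strict grounded prompt combining these rules might look like; the template wording here is an illustrative assumption (the real prompt lives in app/retrieval/query_engine.py):

```python
# Hypothetical template illustrating the grounding rules in the table above.
STRICT_QA_TEMPLATE = (
    "You are a document question-answering assistant.\n"
    "Answer ONLY from the context below. If the answer is not in the context, "
    "say the information is not available in the uploaded documents.\n"
    "Always cite page numbers, and mention the source document_name when relevant.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Question: {query_str}\n"
    "Answer: "
)

def build_prompt(context: str, question: str) -> str:
    """Fill the template with the retrieved top-5 chunks and the user question."""
    return STRICT_QA_TEMPLATE.format(context_str=context, query_str=question)
```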
ai-doc-rag/
├── app/
│ ├── main.py # Streamlit UI (chat + upload + multi-doc)
│ ├── config.py # Config, env vars, URL helpers
│ ├── llm/
│ │ └── __init__.py # Centralized LlamaIndex Settings
│ ├── ingestion/
│ │ └── pipeline.py # LlamaParse → tag → chunk → embed → Neon
│ ├── retrieval/
│ │ └── query_engine.py # Load index → QueryEngine + strict prompt
│ └── utils/
│ └── logger.py # Structured logging
├── tests/
│ ├── test_ingestion.py # Ingestion pipeline + metadata tests
│ └── test_retrieval.py # Query engine + prompt tests
├── Dockerfile
├── requirements.txt
├── .env.example
├── ARCHITECTURE.md
└── README.md
| Layer | Technology |
|---|---|
| PDF Parsing | LlamaParse (cloud) via llama-index-readers-llama-parse |
| Chunking | LlamaIndex SentenceSplitter (512 tokens / 64 overlap) |
| Embeddings | HuggingFace BAAI/bge-small-en-v1.5 (384-dim, LOCAL — no API key) via llama-index-embeddings-huggingface |
| Vector DB | Neon PostgreSQL + pgvector via llama-index-vector-stores-postgres |
| LLM | Groq (temp=0.1) via llama_index.llms.groq |
| Framework | LlamaIndex Core |
| UI | Streamlit |
| Deployment | Docker |