To build a real-world RAG (Retrieval-Augmented Generation) system that can:

- Index a collection of PDF documents
- Allow users to ask natural language questions
- Retrieve relevant context and generate coherent, grounded answers using an LLM
The backend:

- Accepts PDF uploads via `/upload`
- Extracts the text and chunks it into smaller segments
- Embeds each chunk using `all-MiniLM-L6-v2` from `sentence-transformers`
- Stores the vectors in a FAISS index (inner product for cosine similarity)
- Saves metadata (chunk text, source filename, etc.) in JSON (see the ingestion sketch after this list)
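A minimal sketch of that ingestion path, assuming `pypdf` for text extraction and naive fixed-size character chunks; the function names and file paths here are illustrative, not the project's actual layout:

```python
# Illustrative ingestion sketch -- function names and file paths are assumptions.
import json
import faiss
import numpy as np
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
DIM = 384  # embedding size of all-MiniLM-L6-v2

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-size character chunking with a small overlap."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def index_pdf(path: str, index: faiss.IndexFlatIP, metadata: list[dict]) -> None:
    """Extract, chunk, embed, and add one PDF to the FAISS index."""
    text = "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    chunks = chunk_text(text)
    # Normalized embeddings make inner product equivalent to cosine similarity.
    vecs = model.encode(chunks, normalize_embeddings=True)
    index.add(np.asarray(vecs, dtype="float32"))
    metadata.extend({"chunk": c, "source": path} for c in chunks)

index = faiss.IndexFlatIP(DIM)
metadata: list[dict] = []
index_pdf("example.pdf", index, metadata)

faiss.write_index(index, "index.faiss")
with open("metadata.json", "w") as f:
    json.dump(metadata, f)
```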
The `/ask` endpoint performs (sketched below):

- Query embedding
- FAISS retrieval of the top-k chunks
- Prompt construction with the retrieved context
- An LLM call via the local `llama-simple` binary (GGUF model)
- Returning the answer
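A sketch of that flow, reusing the index and metadata files written by the ingestion sketch above; the prompt wording, the GGUF filename, and the `llama-simple` flags are assumptions about this setup, not verified behaviour:

```python
# Illustrative /ask flow -- prompt format and llama-simple flags are assumptions.
import json
import subprocess

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
index = faiss.read_index("index.faiss")
with open("metadata.json") as f:
    metadata = json.load(f)

TOP_K = 4
MODEL_PATH = "llama-2-7b.Q4_K_M.gguf"  # assumed path to the quantized GGUF model

def ask(question: str) -> dict:
    # 1. Embed the query, normalized so inner product behaves like cosine similarity.
    q_vec = model.encode([question], normalize_embeddings=True)
    # 2. Retrieve the top-k chunks from FAISS.
    _scores, ids = index.search(np.asarray(q_vec, dtype="float32"), TOP_K)
    hits = [metadata[i] for i in ids[0] if i != -1]
    # 3. Build a grounded prompt from the retrieved context.
    context = "\n\n".join(f"[{h['source']}] {h['chunk']}" for h in hits)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    # 4. Call the local llama.cpp binary. The flag set ("-m" for the model,
    #    "-n" for token count, prompt as the last argument) is an assumption
    #    about this particular build of llama-simple.
    result = subprocess.run(
        ["./llama-simple", "-m", MODEL_PATH, "-n", "256", prompt],
        capture_output=True, text=True, check=True,
    )
    # 5. Return the generated answer together with its source filenames.
    return {"answer": result.stdout.strip(), "sources": [h["source"] for h in hits]}
```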
The frontend provides a simple interface for:

- Typing a question
- Viewing a loading state
- Displaying the final answer and its sources

It calls the backend's `/ask` endpoint via `lib/api.ts` (the assumed request/response contract is sketched below).
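For reference, the same call `lib/api.ts` makes can be exercised directly against the backend as a quick smoke test; the field names (`question`, `answer`, `sources`) and the local URL are assumptions about the contract, not its documented shape:

```python
# Smoke-test the /ask endpoint directly -- JSON field names and URL are assumptions.
import requests

resp = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What topics does the uploaded document cover?"},
    timeout=120,
)
resp.raise_for_status()
data = resp.json()
print(data["answer"])
print(data["sources"])
```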
What works:

- ✅ Document ingestion works for PDFs.
- ✅ The chunking and embedding pipeline is functional.
- ✅ FAISS retrieval returns relevant chunks.
- ✅ llama-simple can generate responses from the prompt.
- ✅ The frontend successfully queries the backend and shows answers.
Known issues:

- 🔸 The prompt sometimes misleads the LLM into hallucinations.
- 🔸 Answers were occasionally not grounded in the retrieved content.
- 🔸 The LLM was not reliably using the source context until we structured the prompt better.
- 🔸 The local model (llama-simple) isn't fine-tuned for QA.
- 🔸 No reranking or source-based highlighting yet.
- 🔸 Some responses were poorly formatted (e.g., odd punctuation, clipped endings).
Progress so far:

- ✅ Step-by-step RAG architecture implemented
- ✅ Added prompt engineering to improve grounding (see the prompt sketch after this list)
- ✅ Began parsing structured output (JSON-formatted answers)
- ✅ Embedded chunk metadata into the prompts for explainability
- ✅ Clear separation of the backend (retrieval & LLM) from the frontend
Current stack:

- Model: all-MiniLM-L6-v2 (for embeddings)
- LLM: quantized LLaMA-2 7B (via llama.cpp)
- Indexing: FAISS (flat inner product)
- Docs: local PDF files (manually uploaded)
Next steps:

- **Reranking using BAAI/bge-m3 or Cohere Reranker** (a possible shape is sketched after this list)
- Better structured output from the LLM (JSON + citations)
- Citation highlighting on the frontend (clickable chunks)
- Multi-doc support in the UI (PDF-specific filtering)
- Model fallback or hybrid LLM options (e.g., OpenAI for better answers)
- Upload interface & file explorer
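A possible shape for the planned reranking step, using a BGE cross-encoder reranker through `sentence-transformers`; the specific model (`BAAI/bge-reranker-v2-m3` rather than `bge-m3` itself) and the retrieve-then-rerank sizes are assumptions, not a committed design:

```python
# Possible reranking step (not yet implemented) -- model choice and sizes are assumptions.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

def rerank(question: str, hits: list[dict], keep: int = 4) -> list[dict]:
    """Score each retrieved chunk against the question and keep the best ones."""
    scores = reranker.predict([(question, h["chunk"]) for h in hits])
    ranked = sorted(zip(scores, hits), key=lambda pair: pair[0], reverse=True)
    return [h for _, h in ranked[:keep]]

# Usage: over-retrieve from FAISS (e.g. top 20), then rerank down to the top 4
# chunks before building the prompt.
```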