A document Q&A bot that implements Retrieval-Augmented Generation (RAG). Users can upload PDF or text documents and ask questions about their content.
- Upload PDF and text documents (max 10MB)
- Ask natural language questions about uploaded content
- Real-time Q&A with source citations
- Vector-based document similarity search
- DeepSeek LLM integration for responses
- OpenAI embeddings for document processing
/
├── backend/                  # Go API server
│   ├── cmd/
│   │   └── main.go           # Application entry point
│   ├── internal/
│   │   ├── config/           # Configuration handling
│   │   ├── handlers/         # HTTP handlers
│   │   └── services/         # Business logic (RAG pipeline, document processing)
│   ├── pkg/
│   │   ├── types/            # Data structures
│   │   └── utils/            # Utilities
│   │       └── similarity/   # Similarity search algorithm
│   ├── go.mod
│   ├── Makefile
│   ├── .env.example
│   └── .gitignore
└── frontend/                 # Next.js React application
    ├── src/
    │   ├── app/              # Next.js pages
    │   └── types/
    ├── package.json
    ├── tsconfig.json
    ├── .env.example
    └── .gitignore
cd backend
# Install dependencies
go mod download
# Configure environment
cp .env.example .env
# Edit .env with your API keys:
# DEEPSEEK_API_KEY=your_deepseek_api_key_here
# OPENAI_API_KEY=your_openai_api_key_here
# PORT=3001
# Start development server
go run cmd/main.go
The backend will run on http://localhost:3001
cd frontend
# Install dependencies
npm install
# Configure environment
cp .env.example .env.local
# Edit .env.local with:
# NEXT_PUBLIC_BACKEND_URL=http://localhost:3001
# Start development server
npm run dev
The frontend will run on http://localhost:3000
- POST /api/upload - Upload and process documents
- POST /api/query - Ask questions about uploaded documents
- GET /health - Health check
- DEEPSEEK_API_KEY - DeepSeek Chat API key for LLM responses
- OPENAI_API_KEY - OpenAI API key for document embeddings
- PORT - Server port (default: 3001)
- NEXT_PUBLIC_BACKEND_URL - Backend API URL (default: http://localhost:3001)
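As an illustration of how the backend variables above might be consumed, here is a minimal Go sketch. It is not taken from the project's `internal/config` package: the `Config` struct and the `loadConfig` helper are hypothetical names, and failing fast when an API key is missing is an assumption about desired behavior. The sketch reads only the process environment, so whatever mechanism the project uses to load `.env` would have to run first.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the three backend variables listed above.
// (Hypothetical type; the real project may structure this differently.)
type Config struct {
	DeepSeekAPIKey string
	OpenAIAPIKey   string
	Port           string
}

// loadConfig reads settings through getenv so it can be exercised
// without touching the real process environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		DeepSeekAPIKey: getenv("DEEPSEEK_API_KEY"),
		OpenAIAPIKey:   getenv("OPENAI_API_KEY"),
		Port:           getenv("PORT"),
	}
	if cfg.Port == "" {
		cfg.Port = "3001" // documented default
	}
	// Assumption: a missing key is treated as a startup error.
	if cfg.DeepSeekAPIKey == "" {
		return Config{}, errors.New("DEEPSEEK_API_KEY is not set")
	}
	if cfg.OpenAIAPIKey == "" {
		return Config{}, errors.New("OPENAI_API_KEY is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("server would listen on port", cfg.Port)
}
```

Passing the lookup function as a parameter keeps the helper easy to test with a plain map instead of real environment variables.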
- Go 1.24.4 - Programming language
- Gin - Web framework
- DeepSeek API - Language model for responses
- OpenAI Embeddings - Document vectorization
- ledongthuc/pdf - PDF text extraction
- In-memory Vector Store - Document similarity search
- Next.js - React framework
- React - UI library
- TypeScript - Type safety
- Tailwind CSS - Styling
- Document Upload: Files are processed and chunked into 1000-character segments with 200-character overlap
- Embedding: Text chunks are converted to vectors using OpenAI embeddings
- Storage: Vectors stored in memory (ephemeral - resets on restart)
- Query: User questions trigger similarity search to find relevant chunks
- Generation: DeepSeek LLM generates responses based on retrieved context
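The chunking step above (1000-character segments with 200-character overlap) can be sketched roughly as follows. This is an illustrative sliding-window splitter, not the project's actual code: `chunkText` is a hypothetical name, and counting characters as runes rather than bytes is an assumption.

```go
package main

import "fmt"

// chunkText splits text into windows of at most size characters (runes).
// Each window starts size-overlap characters after the previous one, so
// consecutive chunks share overlap characters.
func chunkText(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil // invalid parameters; a fuller version might return an error
	}
	runes := []rune(text)
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window reached the end of the text
		}
	}
	return chunks
}

func main() {
	// Build a 2500-character sample document.
	doc := make([]rune, 2500)
	for i := range doc {
		doc[i] = 'a' + rune(i%26)
	}
	for i, c := range chunkText(string(doc), 1000, 200) {
		fmt.Printf("chunk %d: %d characters\n", i, len([]rune(c)))
	}
}
```

With these parameters each chunk starts 800 characters after the previous one, so a 2500-character document yields three chunks of 1000, 1000, and 900 characters. The overlap is meant to keep a sentence that straddles a boundary retrievable from at least one chunk.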
Frontend → Go Backend /api/upload → Document Processing → Vector Storage
Frontend → Go Backend /api/query → Similarity Search → DeepSeek LLM → Response
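The similarity search step in the query flow can be illustrated with a brute-force scan over an in-memory slice. The README does not say which metric `pkg/utils/similarity` uses; cosine similarity is assumed here because it is a common choice for embedding vectors. `Chunk`, `Match`, `cosineSimilarity`, and `topK` are hypothetical names, and the three-dimensional vectors stand in for real embeddings.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk pairs a piece of document text with its embedding vector.
type Chunk struct {
	Text      string
	Embedding []float64
}

// Match is a chunk's text together with its similarity to the query.
type Match struct {
	Text  string
	Score float64
}

// cosineSimilarity returns dot(a, b) / (|a| * |b|), or 0 when the
// vectors differ in length, are empty, or either has zero magnitude.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// topK scores every stored chunk against the query embedding and
// returns the k highest-scoring matches, best first.
func topK(store []Chunk, query []float64, k int) []Match {
	matches := make([]Match, 0, len(store))
	for _, c := range store {
		matches = append(matches, Match{Text: c.Text, Score: cosineSimilarity(query, c.Embedding)})
	}
	sort.SliceStable(matches, func(i, j int) bool {
		return matches[i].Score > matches[j].Score
	})
	if k < 0 {
		k = 0
	}
	if k > len(matches) {
		k = len(matches)
	}
	return matches[:k]
}

func main() {
	store := []Chunk{
		{Text: "refund policy", Embedding: []float64{0.9, 0.1, 0.0}},
		{Text: "shipping times", Embedding: []float64{0.1, 0.9, 0.0}},
		{Text: "return window", Embedding: []float64{0.8, 0.2, 0.1}},
	}
	query := []float64{1.0, 0.0, 0.0} // toy stand-in for a question embedding
	for _, m := range topK(store, query, 2) {
		fmt.Printf("%.3f  %s\n", m.Score, m.Text)
	}
}
```

A linear scan like this is simple and fits the ephemeral in-memory store described above, but its cost grows with the number of chunks; the retrieved texts would then be passed to the LLM as context.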