Enterprise RAG System

A production-ready RAG (Retrieval-Augmented Generation) system with a Go backend and React frontend. Supports multiple LLM providers (OpenRouter, AWS Bedrock, Ollama) with document ingestion, semantic search, and a modern web interface.

Features

Backend

  • 🚀 High Performance - Built with Go and Fiber framework
  • 🤖 Multi-Provider Support - OpenRouter, AWS Bedrock, and Ollama integration
  • 📄 Document Management - Upload, process, and delete documents with automatic chunking
  • 🔍 Semantic Search - Vector similarity search using cosine similarity
  • 💾 Persistent Storage - BadgerDB for settings/metadata, JSON for vector storage
  • 🔐 Encrypted Settings - AES-256 encryption for API keys
  • 🎯 Flexible Embeddings - Support for Ollama, OpenRouter, and Bedrock embeddings
  • 📊 Streaming Support - Real-time streaming responses for Bedrock
  • 🔧 Configurable - System prompts, models, and chunking parameters
  • 🛡️ Production-Ready - Structured logging, error handling, CORS, and graceful shutdown

Frontend

  • ⚛️ Modern React - Built with React 19, TypeScript, and Vite
  • 🎨 Beautiful UI - Tailwind CSS with shadcn/ui components
  • 🌓 Dark/Light Mode - Theme switching with persistence
  • 💬 Chat Interface - Real-time chat with context display
  • 📁 File Upload - Drag-and-drop document upload
  • 📋 Document List - View and delete uploaded documents
  • ⚙️ Settings Management - API keys, system prompts, and provider configuration
  • 📊 Token Metrics - Track input/output token usage

Quick Start

Prerequisites

  • Go 1.25+ (for backend)
  • Node.js 18+ (for frontend)
  • Ollama (optional, for local embeddings) - install from https://ollama.ai
  • API Keys (at least one):
    • OpenRouter API key, or
    • AWS Bedrock API key

Backend Setup

  1. Clone the repository:
git clone https://github.com/mrkaynak/rag.git
cd rag
  2. Install Go dependencies:
go mod download
  3. Create the .env file:
cp .env.example .env
  4. Configure your environment variables (see the Configuration section)
  5. If using Ollama for embeddings, pull the embedding model:
ollama pull all-minilm:33m
  6. Run the backend server:
go run cmd/server/main.go

The backend will start on http://localhost:3000

Frontend Setup

  1. Navigate to the frontend directory:
cd frontend
  2. Install dependencies:
npm install
  3. Configure the frontend environment:
cp .env.example .env
# Edit .env and set VITE_API_BASE_URL if needed
  4. Start the development server:
npm run dev

The frontend will start on http://localhost:5173

  5. Build for production:
# Set API URL for production
export VITE_API_BASE_URL=https://api.yourdomain.com/api/v1
npm run build

Note: The API URL is configured via the VITE_API_BASE_URL environment variable; the default is http://localhost:3000/api/v1.

Quick Start with Ollama (No API Keys Required)

For a completely local setup without any API keys:

  1. Install Ollama: https://ollama.ai
  2. Pull the embedding model:
ollama pull all-minilm:33m
  3. Set in .env:
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=all-minilm:33m
OLLAMA_BASE_URL=http://localhost:11434
  4. For the LLM itself, you will still need an OpenRouter or Bedrock API key
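
For reference, generating an embedding through Ollama is a single HTTP call. The minimal Go sketch below targets Ollama's /api/embeddings endpoint; the request and response shapes follow Ollama's public API, and it is a standalone illustration rather than this project's embeddings service:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// embedRequest and embedResponse mirror Ollama's /api/embeddings JSON shapes.
type embedRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type embedResponse struct {
	Embedding []float64 `json:"embedding"`
}

func main() {
	body, _ := json.Marshal(embedRequest{
		Model:  "all-minilm:33m",
		Prompt: "What is RAG?",
	})

	resp, err := http.Post("http://localhost:11434/api/embeddings",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	// all-minilm:33m produces 384-dimensional vectors.
	fmt.Printf("embedding has %d dimensions\n", len(out.Embedding))
}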

API Documentation

Health & System

Health Check

GET /api/v1/health

Response:

{
  "status": "healthy",
  "version": "1.0.0"
}

Get System Prompt

GET /api/v1/system-prompt

Document Management

Upload Document

POST /api/v1/upload
Content-Type: multipart/form-data

file: @document.txt

Response:

{
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "file_name": "document.txt",
  "chunk_count": 15
}

Example:

curl -X POST http://localhost:3000/api/v1/upload \
  -F "file=@knowledge.txt"

List Documents

GET /api/v1/documents

Delete Document

DELETE /api/v1/documents/:id

Chat

Chat (Non-streaming)

POST /api/v1/chat
Content-Type: application/json

{
  "message": "What is the main topic?",
  "provider": "openrouter",
  "model": "anthropic/claude-3.5-sonnet",
  "system_prompt": "Custom prompt (optional)"
}

Response:

{
  "message": "Based on the context...",
  "context": ["chunk1", "chunk2"]
}
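
For quick testing outside the browser, the endpoint can be called from any HTTP client. A minimal Go sketch follows; the struct fields mirror the JSON shown above, but the types themselves are illustrative, not the server's internal models:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type chatRequest struct {
	Message  string `json:"message"`
	Provider string `json:"provider"`
	Model    string `json:"model,omitempty"`
}

type chatResponse struct {
	Message string   `json:"message"`
	Context []string `json:"context"`
}

func main() {
	body, _ := json.Marshal(chatRequest{
		Message:  "What is the main topic?",
		Provider: "openrouter",
		Model:    "anthropic/claude-3.5-sonnet",
	})

	resp, err := http.Post("http://localhost:3000/api/v1/chat",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out chatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Message)
	fmt.Printf("answered using %d context chunks\n", len(out.Context))
}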

Chat Stream (SSE)

POST /api/v1/chat/stream
Content-Type: application/json

{
  "message": "What is RAG?",
  "provider": "bedrock"
}

SSE Events:

  • context - Retrieved document chunks
  • chunk - Streaming text chunk
  • done - Stream completed
  • error - Error occurred
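
A minimal Go consumer for this stream could look like the sketch below. It assumes the standard SSE wire format (event: and data: lines separated by blank lines); the exact payload framing may differ, so treat it as a starting point:

package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	body := strings.NewReader(`{"message": "What is RAG?", "provider": "bedrock"}`)
	req, err := http.NewRequest("POST",
		"http://localhost:3000/api/v1/chat/stream", body)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the SSE stream line by line; "event:" names the event type
	// (context, chunk, done, error) and "data:" carries its payload.
	scanner := bufio.NewScanner(resp.Body)
	var event string
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case strings.HasPrefix(line, "event:"):
			event = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
		case strings.HasPrefix(line, "data:"):
			data := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
			if event == "chunk" {
				fmt.Print(data)
			}
		case line == "" && event == "done":
			fmt.Println()
			return
		}
	}
}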

Settings

API Keys

# Save API keys (encrypted)
POST /api/v1/settings/api-keys
{
  "openrouter": "sk-...",
  "bedrock": "aws-..."
}

# Get API keys (masked)
GET /api/v1/settings/api-keys

Models

# Save model configuration
POST /api/v1/settings/models
{
  "provider": "openrouter",
  "model_id": "anthropic/claude-3.5-sonnet",
  "display_name": "Claude 3.5 Sonnet"
}

# List models
GET /api/v1/settings/models

# Delete model
DELETE /api/v1/settings/models/:id

System Prompts

# Save system prompt
POST /api/v1/settings/system-prompts
{
  "name": "Default",
  "prompt": "You are a helpful assistant...",
  "default": true
}

# List system prompts
GET /api/v1/settings/system-prompts

# Get default system prompt
GET /api/v1/settings/system-prompts/default

# Delete system prompt
DELETE /api/v1/settings/system-prompts/:id

Architecture

Project Structure

rag/
├── cmd/server/              # Application entrypoint
│   └── main.go
├── frontend/                # React frontend (React 19 + TypeScript)
│   ├── src/
│   │   ├── components/      # UI components (shadcn/ui)
│   │   │   ├── ui/          # Base UI components
│   │   │   ├── chat-interface.tsx
│   │   │   ├── file-upload.tsx
│   │   │   ├── documents-list.tsx
│   │   │   ├── api-keys-manager.tsx
│   │   │   ├── system-prompt-editor.tsx
│   │   │   └── token-metrics.tsx
│   │   ├── lib/            # API client & utilities
│   │   │   ├── api.ts      # Backend API client
│   │   │   └── utils.ts    # Helper functions
│   │   ├── hooks/          # React hooks
│   │   │   ├── use-theme.tsx
│   │   │   └── use-toast.ts
│   │   ├── App.tsx         # Main application
│   │   └── main.tsx        # Entry point
│   ├── package.json
│   ├── vite.config.ts
│   └── tailwind.config.js
├── internal/
│   ├── config/              # Environment configuration
│   │   └── config.go        # Config loading & validation
│   ├── handler/             # HTTP request handlers
│   │   ├── chat.go          # Chat & streaming endpoints
│   │   ├── upload.go        # Document upload & management
│   │   ├── settings.go      # Settings API
│   │   └── health.go        # Health check
│   ├── middleware/          # HTTP middleware
│   │   ├── cors.go          # CORS configuration
│   │   ├── logger.go        # Request logging
│   │   └── recovery.go      # Panic recovery
│   ├── models/              # Data structures
│   │   └── models.go        # Document, Chunk, Request/Response types
│   └── service/
│       ├── document/
│       │   ├── document.go  # Document processing & chunking
│       │   └── metadata.go  # Document metadata store (BadgerDB)
│       ├── embeddings/
│       │   └── embeddings.go # Multi-provider embeddings
│       ├── llm/
│       │   ├── openrouter.go # OpenRouter client
│       │   └── bedrock.go    # AWS Bedrock client (with streaming)
│       ├── settings/
│       │   ├── settings.go   # Settings store (BadgerDB, encrypted)
│       │   └── seed.go       # Initial data seeding
│       └── vector/
│           └── vector.go     # Vector similarity search (JSON)
├── pkg/
│   └── errors/              # Custom error types
├── data/                    # Persistent data (auto-created)
│   ├── uploads/             # Uploaded documents
│   ├── vectors/             # Vector embeddings (JSON)
│   └── badger/              # BadgerDB files
├── .env.example             # Example environment configuration
├── .env                     # Your configuration (gitignored)
├── go.mod                   # Go dependencies
└── README.md                # This file

How It Works

┌──────────────┐
│   Upload     │
│  Document    │
└──────┬───────┘
       │
       ▼
┌──────────────┐      ┌──────────────┐
│   Document   │─────▶│    Chunk     │
│   Service    │      │  (overlap)   │
└──────────────┘      └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │  Embeddings  │◀───Ollama/OpenRouter/Bedrock
                      │   Service    │
                      └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Vector     │
                      │    Store     │
                      └──────┬───────┘
                             │
        ┌────────────────────┴────────────────────┐
        │                                         │
        ▼                                         ▼
┌──────────────┐                          ┌──────────────┐
│    Query     │                          │   Metadata   │
│  (search)    │                          │    Store     │
└──────┬───────┘                          └──────────────┘
       │
       ▼
┌──────────────┐      ┌──────────────┐
│  Retrieve    │─────▶│     LLM      │◀───OpenRouter/Bedrock
│   Context    │      │  (augment)   │
└──────────────┘      └──────┬───────┘
                             │
                             ▼
                      ┌──────────────┐
                      │   Response   │
                      └──────────────┘

Flow:

  1. Document Upload: Documents are uploaded and split into overlapping chunks (configurable size)
  2. Embedding Generation: Each chunk is converted to a vector embedding via Ollama/OpenRouter/Bedrock
  3. Vector Storage: Embeddings are stored in memory with JSON persistence for fast similarity search
  4. Metadata Storage: Document metadata (filename, size, chunk count) is stored in BadgerDB
  5. Query Processing: User questions are embedded, and the most similar chunks are retrieved using cosine similarity
  6. Context Augmentation: The top-K most relevant chunks are added to the LLM prompt as context
  7. Response Generation: The LLM generates an answer from the retrieved context and the user's question
  8. Streaming: For Bedrock, responses can be streamed in real time via SSE
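
Step 5 is the mathematical core of retrieval: the query vector is compared against every stored chunk vector using cosine similarity, cos(a, b) = a·b / (|a||b|). A minimal Go sketch of that computation (illustrative, not the vector store's actual code):

package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns a·b / (|a||b|), ranging from -1 to 1;
// higher values mean the vectors point in more similar directions.
func cosineSimilarity(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	query := []float64{0.1, 0.9, 0.2}
	chunk := []float64{0.2, 0.8, 0.1}
	fmt.Printf("similarity: %.4f\n", cosineSimilarity(query, chunk))
}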

Technology Stack

Backend:

  • Go 1.25 - High-performance compiled language
  • Fiber v2 - Express-inspired web framework
  • BadgerDB v4 - Embedded key-value store for settings & metadata
  • Zap - Structured, leveled logging
  • AES-256-GCM - Encryption for sensitive API keys

Frontend:

  • React 19 - Latest React with concurrent features
  • TypeScript 5.8 - Type-safe development
  • Vite 7 - Fast build tool and dev server
  • Tailwind CSS 3.4 - Utility-first CSS framework
  • shadcn/ui - High-quality component library
  • Radix UI - Accessible primitives

Storage:

  • BadgerDB: Settings, API keys (encrypted), metadata
  • JSON: Vector embeddings (in-memory + file persistence)
  • File System: Uploaded documents
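
To illustrate the BadgerDB side, reads and writes go through transactions. A minimal sketch using the badger/v4 API follows; the key name is hypothetical, not the project's actual schema:

package main

import (
	"fmt"

	badger "github.com/dgraph-io/badger/v4"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("./data/badger"))
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Write a value inside a read-write transaction.
	err = db.Update(func(txn *badger.Txn) error {
		return txn.Set([]byte("doc:example:name"), []byte("knowledge.txt"))
	})
	if err != nil {
		panic(err)
	}

	// Read it back inside a read-only transaction.
	err = db.View(func(txn *badger.Txn) error {
		item, err := txn.Get([]byte("doc:example:name"))
		if err != nil {
			return err
		}
		val, err := item.ValueCopy(nil)
		if err != nil {
			return err
		}
		fmt.Printf("stored value: %s\n", val)
		return nil
	})
	if err != nil {
		panic(err)
	}
}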

Configuration

Environment Variables

All configuration is done via the .env file:

Server
  • PORT - Server port (default: 3000)
  • ENV - Environment, development or production (default: development)

OpenRouter
  • OPENROUTER_API_KEY - OpenRouter API key (required*)
  • OPENROUTER_MODEL - Default model (default: anthropic/claude-3.5-sonnet)

AWS Bedrock
  • BEDROCK_API_KEY - AWS Bedrock API key (required*)
  • BEDROCK_REGION - AWS region (default: eu-north-1)
  • BEDROCK_MODEL_ID - Model ID (default: openai.gpt-oss-20b-1:0)

Ollama
  • OLLAMA_BASE_URL - Ollama server URL (default: http://localhost:11434)

Embeddings
  • EMBEDDING_PROVIDER - Embedding provider: ollama, openrouter, or bedrock (default: ollama)
  • EMBEDDING_MODEL - Embedding model name (default: all-minilm:33m)
  • EMBEDDING_DIMENSIONS - Vector dimensions (default: 384)

Storage
  • UPLOAD_DIR - Upload directory (default: ./data/uploads)
  • VECTOR_STORE_PATH - Vector store path (default: ./data/vectors)
  • BADGER_DB_PATH - BadgerDB path (default: ./data/badger)

Encryption
  • ENCRYPTION_KEY - 32-byte AES-256 key (no default; strongly recommended)

RAG
  • MAX_CONTEXT_CHUNKS - Maximum chunks in context (default: 5)
  • CHUNK_SIZE - Characters per chunk (default: 1000)
  • CHUNK_OVERLAP - Overlap between chunks, in characters (default: 200)
  • SYSTEM_PROMPT - Default system prompt (default: built-in)

* At least one LLM provider (OpenRouter or Bedrock) is required
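
Loading these values follows the usual Go pattern of os.Getenv with fallbacks. A minimal sketch (the helpers below are illustrative; the project's actual loader lives in internal/config/config.go):

package main

import (
	"fmt"
	"os"
	"strconv"
)

// getEnv returns the variable's value, or a fallback when it is unset.
func getEnv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// getEnvInt is the integer variant, falling back on unset or unparseable values.
func getEnvInt(key string, fallback int) int {
	if v := os.Getenv(key); v != "" {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return fallback
}

func main() {
	port := getEnv("PORT", "3000")
	chunkSize := getEnvInt("CHUNK_SIZE", 1000)
	chunkOverlap := getEnvInt("CHUNK_OVERLAP", 200)
	fmt.Printf("port=%s chunk_size=%d overlap=%d\n", port, chunkSize, chunkOverlap)
}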

Example Configurations

Ollama (Local, No API Keys for Embeddings):

PORT=3000
ENV=development

# Ollama for embeddings (local, no API key needed)
EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=all-minilm:33m
EMBEDDING_DIMENSIONS=384
OLLAMA_BASE_URL=http://localhost:11434

# OpenRouter for LLM (API key required)
OPENROUTER_API_KEY=sk-or-v1-...
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

# Storage
UPLOAD_DIR=./data/uploads
VECTOR_STORE_PATH=./data/vectors
BADGER_DB_PATH=./data/badger

# Encryption (generate a secure 32-byte key)
ENCRYPTION_KEY=your-32-byte-encryption-key-here

# RAG settings
MAX_CONTEXT_CHUNKS=5
CHUNK_SIZE=1000
CHUNK_OVERLAP=200

OpenRouter (Cloud):

EMBEDDING_PROVIDER=openrouter
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
OPENROUTER_API_KEY=sk-or-v1-...
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

AWS Bedrock:

EMBEDDING_PROVIDER=bedrock
EMBEDDING_MODEL=amazon.titan-embed-text-v1
BEDROCK_API_KEY=your_aws_key
BEDROCK_REGION=us-east-1
BEDROCK_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0

Production Deployment

Building

Backend:

# Build binary
go build -o rag-server cmd/server/main.go

# Run
./rag-server

Frontend:

cd frontend
npm run build
# Build output in frontend/dist/

Docker Deployment

Create Dockerfile:

# Stage 1: Build frontend
FROM node:18-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build

# Stage 2: Build Go backend
FROM golang:1.25-alpine AS backend-builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o rag-server cmd/server/main.go

# Stage 3: Production
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=backend-builder /app/rag-server .
COPY --from=frontend-builder /app/frontend/dist ./cmd/server/frontend/dist
COPY .env.example .env

# Create data directories
RUN mkdir -p data/uploads data/vectors data/badger

EXPOSE 3000
CMD ["./rag-server"]

Build and run:

docker build -t rag-system .
docker run -p 3000:3000 --env-file .env rag-system

Docker Compose

Create docker-compose.yml:

version: '3.8'

services:
  rag-server:
    build: .
    ports:
      - "3000:3000"
    env_file:
      - .env
    volumes:
      - ./data:/app/data
    restart: unless-stopped

  # Optional: Ollama for local embeddings
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    restart: unless-stopped

volumes:
  ollama-data:

Run:

docker-compose up -d

Performance Considerations

  • Vector Store: Currently uses in-memory storage with JSON persistence. For production at scale (100k+ documents), consider:

    • PostgreSQL with pgvector
    • Qdrant
    • Weaviate
    • Pinecone
  • Concurrency: The vector store uses read-write locks (sync.RWMutex) for thread-safe operations

  • File Size: Current implementation loads entire documents into memory. For large files (>10MB):

    • Implement streaming file processing
    • Add file size limits
    • Process in batches
  • Chunking: The current implementation uses character-based chunking (a sketch follows this list). Consider:

    • Sentence-aware chunking
    • Paragraph-based chunking
    • Token-based chunking for better LLM compatibility
  • Rate Limiting: Add rate limiting middleware for production:

import "github.com/gofiber/fiber/v2/middleware/limiter"

app.Use(limiter.New(limiter.Config{
    Max:        100,
    Expiration: 1 * time.Minute,
}))
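
As referenced in the Chunking item above, here is a minimal sketch of character-based chunking with overlap; it is illustrative, not the document service's actual code:

package main

import "fmt"

// chunkText splits text into chunks of up to size runes, where
// consecutive chunks share overlap runes of context.
func chunkText(text string, size, overlap int) []string {
	runes := []rune(text) // index by rune so multi-byte characters stay intact
	step := size - overlap
	if step <= 0 {
		step = size
	}
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}

func main() {
	chunks := chunkText("some long document text ...", 10, 3)
	fmt.Printf("%d chunks: %q\n", len(chunks), chunks)
}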

Security Best Practices

  • HTTPS: Always use HTTPS in production (reverse proxy with nginx/Caddy)
  • API Keys: Rotate API keys regularly
  • Encryption: Use a strong 32-byte ENCRYPTION_KEY for AES-256 (see the AES-GCM sketch after this list)
  • Authentication: Implement authentication middleware for production
  • File Upload: Validate file types and sizes:
    if file.Size > 10*1024*1024 { // 10MB limit
        return errors.BadRequest("file too large")
    }
  • CORS: Configure CORS for specific origins in production
  • Rate Limiting: Prevent abuse with rate limiting
  • Input Validation: Sanitize all user inputs
  • Monitoring: Log all API access and monitor for anomalies
  • Environment Variables: Never commit .env to version control
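
For intuition on the encryption side, the sketch below shows AES-256-GCM with Go's standard library, the same primitive the settings store relies on for API keys. The helper and key value are illustrative, not the project's implementation:

package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// encrypt seals plaintext with AES-256-GCM; key must be exactly 32 bytes.
func encrypt(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so decryption can recover it later.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := []byte("your-32-byte-encryption-key-here") // exactly 32 bytes
	ct, err := encrypt(key, []byte("sk-or-v1-example"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("ciphertext: %x\n", ct)
}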

Troubleshooting

Backend Issues

Port already in use:

# Change PORT in .env
PORT=3001

BadgerDB errors:

# Delete corrupted database
rm -rf data/badger
# Restart server (will recreate)

Ollama connection failed:

# Check if Ollama is running
ollama list

# Start Ollama
ollama serve

Frontend Issues

API connection failed:

  • Check backend is running on port 3000
  • Verify CORS settings in backend
  • Check browser console for errors

Build errors:

# Clear node_modules and reinstall
rm -rf node_modules package-lock.json
npm install

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow Go best practices and effective Go guidelines
  • Use TypeScript strict mode for frontend
  • Write meaningful commit messages
  • Add tests for new features
  • Update documentation

License

MIT License - see LICENSE file for details

Roadmap

  • PostgreSQL with pgvector support
  • OpenAI embeddings support
  • PDF document support
  • Multi-language support
  • Conversation history
  • User authentication
  • Admin dashboard
  • Kubernetes deployment guide
  • Prometheus metrics
  • Unit and integration tests
