A FastAPI-based backend service that enables users to upload text-based documents and interact with their content through natural language queries using Retrieval-Augmented Generation (RAG).
- Document Upload: Support for PDF, DOCX, and TXT files
- Semantic Search: Uses Hugging Face embeddings for intelligent document retrieval
- AI-Powered Q&A: Answers questions based on your uploaded documents
- Local Vector Storage: ChromaDB for efficient semantic search
- Fast & Modern: Built with FastAPI for high performance
- Architecture Guide - Detailed system architecture and design decisions
- Contributing Guide - Guidelines for contributing to the project
- Security Guide - API key management and security best practices
- Testing Guide - How to run and write tests
- Examples - Usage examples and sample documents
- Setup Scripts - Automated setup instructions
- Changelog - Version history and release notes
Current Version: 1.0.0
This project follows Semantic Versioning, with a single source of truth in the `VERSION` file.
```bash
# Check current version
python scripts/version.py

# Bump versions
python scripts/version.py patch   # 1.0.0 → 1.0.1 (bug fixes)
python scripts/version.py minor   # 1.0.0 → 1.1.0 (new features)
python scripts/version.py major   # 1.0.0 → 2.0.0 (breaking changes)

# Set a specific version
python scripts/version.py set 2.0.0
```

Single source of truth:
- Version defined in: the `VERSION` file
- Auto-read by: `src/__init__.py`, `pyproject.toml`, FastAPI
- Exposed via: the `GET /health` endpoint
- See `CHANGELOG.md` for release history
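The patch/minor/major bump rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual contents of `scripts/version.py`:

```python
# Minimal semantic-version bump logic (illustrative; the real logic
# lives in scripts/version.py and may differ).

def bump(version: str, part: str) -> str:
    """Bump a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":   # breaking changes
        return f"{major + 1}.0.0"
    if part == "minor":   # new features
        return f"{major}.{minor + 1}.0"
    if part == "patch":   # bug fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")

print(bump("1.0.0", "patch"))  # 1.0.1
print(bump("1.0.0", "major"))  # 2.0.0
```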
```
                ┌─────────────┐
                │   Client    │
                └──────┬──────┘
                       │
                       ▼
                ┌─────────────┐
                │   FastAPI   │
                │  Endpoints  │
                └──────┬──────┘
                       │
        ┌──────────────┼──────────────┐
        ▼              ▼              ▼
 ┌────────────┐ ┌────────────┐ ┌────────────┐
 │   Upload   │ │   Query    │ │   Health   │
 │   Router   │ │   Router   │ │  Endpoint  │
 └──────┬─────┘ └──────┬─────┘ └────────────┘
        │              │
        ▼              ▼
 ┌────────────────────────────┐
 │         RAG Engine         │
 │  - Document Chunking       │
 │  - Embedding Generation    │
 │  - Semantic Retrieval      │
 │  - LLM Answer Generation   │
 └─────────────┬──────────────┘
               │
        ┌──────┴───────┬──────────────┐
        ▼              ▼              ▼
 ┌────────────┐ ┌────────────┐ ┌─────────────┐
 │  ChromaDB  │ │  Hugging   │ │  Sentence   │
 │   Vector   │ │   Face     │ │Transformers │
 │   Store    │ │    LLM     │ │ Embeddings  │
 └────────────┘ └────────────┘ └─────────────┘
```
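In code, the RAG engine's core loop boils down to: chunk the document, embed each chunk, embed the query, and return the most similar chunks. Here is a toy, dependency-free sketch of that flow; `embed` is a bag-of-words stand-in for the real Sentence Transformers embeddings, and none of these helpers are the project's actual functions:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words vector (the project uses Sentence Transformers)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]
```

The real engine stores the chunk embeddings in ChromaDB rather than recomputing them per query, and passes the retrieved chunks to the LLM as context for answer generation.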
- Python 3.9+
- 4GB+ RAM recommended
- (Optional) Hugging Face API key for larger models
Linux/Mac:

```bash
git clone <repo-url>
cd ai-knowledge-api
chmod +x scripts/setup.sh
./scripts/setup.sh
```

Windows:

```bash
git clone <repo-url>
cd ai-knowledge-api
scripts\setup.bat
```

Manual setup:

```bash
git clone <repo-url>
cd ai-knowledge-api
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
```

Edit `.env` with your settings:

```env
HF_API_KEY=huggingface_key_here
MODEL_EMBEDDING=sentence-transformers/all-MiniLM-L6-v2
MODEL_LLM=microsoft/phi-2
```

Note: See `docs/SECURITY.md` for best practices on managing API keys securely. The only required variable is `HF_API_KEY`, and only if you plan to use Hugging Face models that need authentication.
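Settings like these are typically read from the environment with fallbacks to the defaults listed in the Configuration section. A hedged sketch of such a loader (the variable names match `.env.example`; the helper itself is illustrative, not the project's actual code):

```python
import os
from typing import Optional

def load_settings(env: Optional[dict] = None) -> dict:
    """Read configuration from the environment, falling back to documented defaults."""
    env = env if env is not None else os.environ
    return {
        "hf_api_key": env.get("HF_API_KEY"),  # only needed for gated HF models
        "db_dir": env.get("DB_DIR", "./data/vectors"),
        "model_embedding": env.get("MODEL_EMBEDDING",
                                   "sentence-transformers/all-MiniLM-L6-v2"),
        "model_llm": env.get("MODEL_LLM", "microsoft/phi-2"),
    }
```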
```bash
uvicorn src.main:app --reload --host 0.0.0.0 --port 7860
```

The API will be available at:
- Interactive Docs: http://localhost:7860
- ReDoc: http://localhost:7860/redoc
- Health Check: http://localhost:7860/health
GET /health
Check if the API is running.
```bash
curl http://localhost:7860/health
```

Response:

```json
{
  "status": "ok",
  "message": "AI Knowledge API is running"
}
```

POST /upload/text

Upload text content directly.

```bash
curl -X POST http://localhost:7860/upload/text \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Artificial Intelligence is transforming the world. Machine learning enables computers to learn from data."
  }'
```

Response:

```json
{
  "message": "Document indexed successfully",
  "chunks_stored": 1
}
```

POST /upload
Upload a document file.
```bash
# Upload a text file
curl -X POST http://localhost:7860/upload \
  -F "file=@document.txt"

# Upload a PDF file
curl -X POST http://localhost:7860/upload \
  -F "file=@document.pdf"

# Upload with text form data
curl -X POST http://localhost:7860/upload \
  -F "text=Your document content here"
```

Supported file types:
- `.txt` - Plain text
- `.pdf` - PDF documents
- `.docx` - Microsoft Word documents
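The file-type check behind this whitelist amounts to a case-insensitive suffix test. A minimal sketch (the actual validation lives in `src/routers/upload.py` and may be implemented differently):

```python
from pathlib import Path

# Extensions from the supported-file-types list above.
ALLOWED_EXTENSIONS = {".txt", ".pdf", ".docx"}

def is_supported(filename: str) -> bool:
    """Accept only whitelisted file extensions, case-insensitively."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS

print(is_supported("notes.TXT"))   # True
print(is_supported("image.png"))   # False
```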
POST /query
Ask a question about uploaded documents.
```bash
curl -X POST http://localhost:7860/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is artificial intelligence?"
  }'
```

Response:

```json
{
  "answer": "Artificial Intelligence is a field that focuses on creating systems capable of learning and making decisions...",
  "context": [
    "Artificial Intelligence is transforming the world.",
    "Machine learning enables computers to learn from data."
  ]
}
```

GET /query/stats

Get information about indexed documents.

```bash
curl http://localhost:7860/query/stats
```

Response:

```json
{
  "total_chunks": 15,
  "status": "ready"
}
```

DELETE /query/clear

Remove all indexed documents.

```bash
curl -X DELETE http://localhost:7860/query/clear
```

Response:

```json
{
  "message": "Database cleared successfully",
  "total_chunks": 0
}
```

```bash
# Make sure the server is running first
uvicorn src.main:app --reload

# In another terminal, run the tests
python tests/test_api.py
```

See `tests/README.md` for more details on testing.
```python
import requests

# Base URL
BASE_URL = "http://localhost:7860"

# 1. Check health
response = requests.get(f"{BASE_URL}/health")
print(response.json())

# 2. Upload text
response = requests.post(
    f"{BASE_URL}/upload/text",
    json={
        "text": "Python is a high-level programming language. It is widely used for web development, data science, and AI."
    }
)
print(response.json())

# 3. Query
response = requests.post(
    f"{BASE_URL}/query",
    json={
        "question": "What is Python used for?"
    }
)
print(response.json())
```

Try uploading the sample document:

```bash
curl -X POST http://localhost:7860/upload \
  -F "file=@examples/sample_document.txt"
```

See `examples/README.md` for more usage examples.
Build the image:

```bash
docker build -t ai-knowledge-api .
```

Run the container:

```bash
docker run -p 7860:7860 \
  -e HF_API_KEY=your_key \
  -v $(pwd)/data:/app/data \
  ai-knowledge-api
```

To deploy on Hugging Face Spaces:

- Create a new Space on Hugging Face
- Select Docker as the SDK
- Push your code to the Space repository:

  ```bash
  git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE
  git push hf main
  ```

- Add secrets in the Space settings:
  - `HF_API_KEY`: Your Hugging Face API key
```
ai-knowledge-api/
├── Configuration Files (root)
│   ├── requirements.txt          # Python dependencies
│   ├── Dockerfile                # Docker configuration
│   ├── docker-compose.yml        # Docker Compose setup
│   ├── .env.example              # Environment variables template
│   ├── .gitignore
│   ├── LICENSE
│   └── README.md
│
├── src/                          # Application source code
│   ├── __init__.py
│   ├── README.md                 # Source code documentation
│   ├── main.py                   # FastAPI application entry point
│   │
│   ├── routers/                  # API route handlers
│   │   ├── __init__.py
│   │   ├── upload.py             # Document upload endpoints
│   │   └── query.py              # Query endpoints
│   │
│   ├── services/                 # Business logic services
│   │   ├── __init__.py
│   │   ├── embeddings.py         # Embedding generation service
│   │   ├── rag_engine.py         # RAG logic and vector database
│   │   └── document_processor.py # Document text extraction
│   │
│   └── models/                   # Data models
│       ├── __init__.py
│       └── schemas.py            # Pydantic models
│
├── data/                         # Application data
│   └── vectors/                  # ChromaDB storage (persisted)
│
├── docs/                         # Documentation files
│   ├── README.md                 # Documentation overview
│   ├── ARCHITECTURE.md           # System architecture details
│   ├── CONTRIBUTING.md           # Contribution guidelines
│   └── SECURITY.md               # Security best practices
│
├── scripts/                      # Setup and utility scripts
│   ├── README.md                 # Scripts documentation
│   ├── setup.sh                  # Linux/Mac setup script
│   └── setup.bat                 # Windows setup script
│
├── tests/                        # Test files
│   ├── README.md                 # Testing documentation
│   └── test_api.py               # API endpoint tests
│
├── examples/                     # Sample files and usage examples
│   ├── README.md                 # Examples documentation
│   └── sample_document.txt       # Sample document for testing
│
└── logs/                         # Application logs (git-ignored)
    └── api.log                   # Server logs
```
| Variable | Description | Default |
|---|---|---|
| `HF_API_KEY` | Hugging Face API key | - |
| `DB_DIR` | Vector database directory | `./data/vectors` |
| `MODEL_EMBEDDING` | Embedding model | `sentence-transformers/all-MiniLM-L6-v2` |
| `MODEL_LLM` | Language model for Q&A | `microsoft/phi-2` |
You can change the models in `.env`.

Embedding models (smaller = faster, larger = more accurate):
- `sentence-transformers/all-MiniLM-L6-v2` (default, 80MB)
- `sentence-transformers/all-mpnet-base-v2` (better quality, 420MB)

LLM models:
- `microsoft/phi-2` (default, works without GPU)
- `mistralai/Mistral-7B-Instruct-v0.1` (better quality, requires GPU)
- File Upload Limits: Implement file size limits in production
- Rate Limiting: Add rate limiting for API endpoints
- Authentication: Add API key authentication for production use
- Input Sanitization: Already implemented for file types
- CORS: Configure specific origins in production
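For the rate-limiting point above, a token bucket is one common approach; in production you would more likely reach for middleware such as `slowapi`. A self-contained sketch of the idea (not part of this project's code):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens/second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per client (keyed by API key or IP) would then gate each request before it reaches the routers.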
Out of memory: use smaller models or increase system RAM.

```env
MODEL_LLM=microsoft/phi-2
```

Slow responses:
- Models are loaded lazily on first use, so an initial delay is expected
- Consider using GPU acceleration
- Reduce `top_k` in retrieval

Import errors: reinstall the dependencies.

```bash
pip install --upgrade -r requirements.txt
```

- Use GPU: Install PyTorch with CUDA for faster inference
- Batch Processing: Upload multiple documents at once
- Caching: Models are cached after first load
- Quantization: Use quantized models for lower memory usage
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
This project is licensed under the MIT License.
- FastAPI - Modern web framework
- LangChain - Document processing utilities
- Hugging Face - Pre-trained models
- ChromaDB - Vector database
- Sentence Transformers - Embedding models
For issues and questions:
- Open an issue on GitHub
- Check the FastAPI documentation
- Visit Hugging Face documentation
Built with ❤️ using FastAPI and Hugging Face