Arabic-first AI platform for training, grounding, and serving reliable assistants.
Gazera combines QLoRA fine-tuning, retrieval-augmented generation (RAG), citation-aware responses, FastAPI serving, and a Next.js web app in one repo.
- Platform docs: gazera/README.md
- RunPod guide: gazera/ops/runpod/README.md
- API server: gazera/serving/api/main.py
- Web app: gazera/ui/web/
- Arabic-first quality instead of Arabic-as-translation.
- Grounded answers with document citations for higher trust.
- Full lifecycle in one codebase: data prep, training, eval, API, and UI.
- Built to run on practical hardware (for example: RTX 4090 workflows).
- gazera/: Core product and ML stack.
- gazera/ui/web/: Next.js web app.
- gazera/serving/api/: FastAPI endpoints for chat and RAG chat.
- gazera/training/: SFT and ORPO training scripts/configs.
- gazera/rag/: Chunking, indexing, retrieval, and citations.
- gazera/eval/: Evaluation harness and datasets.
- index.html: Public site entry page for this GitHub Pages repo.
| Capability | What it does | Where |
|---|---|---|
| Arabic SFT | QLoRA fine-tuning on Arabic instruction data | gazera/training/scripts/train_sft.py |
| Preference tuning | ORPO training for response quality shaping | gazera/training/scripts/train_orpo.py |
| Grounded QA | Retrieves relevant chunks and returns citations | gazera/rag/ + gazera/serving/api/rag_routes.py |
| Inference API | Serves /chat, /rag/chat, and /health endpoints | gazera/serving/api/ |
| Web experience | Next.js frontend for interacting with the model | gazera/ui/web/ |
| Evaluation | Task harness for QA and grounding checks | gazera/eval/ |
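The grounded-QA capability above can be illustrated with a small sketch: retrieved chunks are numbered so the answer can reference them as citations. The names here (`Chunk`, `format_citations`, the field names) are hypothetical illustrations, not the actual gazera/rag API.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str   # source document identifier
    text: str     # retrieved passage
    score: float  # retrieval similarity score


def format_citations(chunks: list[Chunk]) -> tuple[str, list[str]]:
    """Number the retrieved chunks (best first) and return (context, citations)."""
    context_parts, citations = [], []
    for i, ch in enumerate(sorted(chunks, key=lambda c: -c.score), start=1):
        context_parts.append(f"[{i}] {ch.text}")
        citations.append(f"[{i}] {ch.doc_id}")
    return "\n".join(context_parts), citations


chunks = [
    Chunk("docs/sample.md", "Gazera is an Arabic-first platform.", 0.91),
    Chunk("docs/intro.md", "It serves grounded answers.", 0.74),
]
context, cites = format_citations(chunks)
```

The numbered context is passed to the model, and the citation list is returned alongside the answer so users can verify each claim against its source document.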
```
Data (JSONL + docs)
  -> data/scripts/*      (normalize, dedupe, split, validate)
  -> training/scripts/*  (SFT / ORPO with QLoRA)
  -> serving/worker/*    (Transformers or vLLM engine)
  -> serving/api/*       (/chat, /rag/chat)
  -> ui/web/*            (Next.js frontend)
```
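The data-prep stage of the pipeline above (normalize, dedupe, split) can be sketched roughly as follows. The record schema, split ratio, and function names are assumptions for illustration, not the actual data/scripts implementation.

```python
import hashlib
import json
import random


def normalize(record: dict) -> dict:
    # Collapse stray whitespace in string fields (assumed instruction/response schema).
    return {k: " ".join(v.split()) if isinstance(v, str) else v
            for k, v in record.items()}


def dedupe(records: list[dict]) -> list[dict]:
    # Drop exact duplicates by hashing each record's canonical JSON form.
    seen, out = set(), []
    for r in records:
        key = hashlib.sha256(
            json.dumps(r, sort_keys=True, ensure_ascii=False).encode()
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out


def split(records: list[dict], eval_frac: float = 0.1, seed: int = 0):
    # Deterministic shuffle, then carve off an eval slice.
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_eval = int(len(shuffled) * eval_frac)
    return shuffled[n_eval:], shuffled[:n_eval]  # (train, eval)


raw = [{"instruction": "  Marhaban  ", "response": "Ahlan"},
       {"instruction": "Marhaban", "response": "Ahlan"}]
clean = dedupe([normalize(r) for r in raw])
```

Normalizing before deduplication matters: the two records above differ only in whitespace, so they collapse to one after cleaning.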
Prerequisites:
- Python 3.11
- Node.js 18+
- Docker (for Qdrant in RAG mode)
```bash
git clone https://github.com/FayezBast/gazera-labs.github.io.git
cd gazera-labs.github.io/gazera

# Python env + deps + dev tooling
make setup

# Optional: create local env config
cp .env.example .env

# Optional: start vector DB for RAG
make rag_up
make ingest_docs

# Start API
make serve

# In a second terminal, start web UI
make ui
```

API docs will be available at http://localhost:8000/docs.
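Once the API is up, the endpoints can also be called from Python. A minimal standard-library sketch, assuming the default localhost:8000 base URL and the request schema shown in the curl examples; the helper names are illustrative only.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default from the quickstart


def build_payload(user_message: str,
                  max_tokens: int = 256,
                  temperature: float = 0.2) -> dict:
    # Mirrors the JSON body used in the curl examples.
    return {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(message: str, path: str = "/chat") -> dict:
    # POST the payload; use path="/rag/chat" for grounded answers with citations.
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(build_payload(message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_payload("Marhaban, arrif binafsak")
# chat("Marhaban, arrif binafsak") would POST to /chat once the server is running.
```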
Standard chat:
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages":[{"role":"user","content":"Marhaban, arrif binafsak"}],
    "max_tokens":256,
    "temperature":0.2
  }'
```

RAG chat with citations:
```bash
curl -X POST http://localhost:8000/rag/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages":[{"role":"user","content":"What does the sample doc say?"}],
    "max_tokens":256,
    "temperature":0.2
  }'
```

Training:

```bash
cd gazera

# Supervised fine-tuning (QLoRA)
make train_sft

# Preference optimization
make train_orpo

# Merge adapter with base model
make merge
```

Configs:
- gazera/training/configs/sft_qwen25_7b_qlora.yaml
- gazera/training/configs/orpo_qwen25_7b_qlora.yaml
Use the RunPod guide for GPU cloud setup and scripts:
- gazera/ops/runpod/README.md
- gazera/ops/runpod/setup.sh
- gazera/ops/runpod/train.sh
```
gazera/
  data/      # datasets, prompts, preprocessing scripts
  training/  # SFT / ORPO training + merge
  rag/       # ingestion, retrieval, citations
  serving/   # API + inference engines
  ui/web/    # Next.js frontend
  eval/      # eval harness and datasets
  docs/      # roadmap, vision, model card, policy
```
- Keep secrets in local env files only (.env and .env.* are ignored).
- Use template files for sharing config (.env.example, .env.runpod.example).
- Do not commit credentials, private keys, or infrastructure state.
- Core technical README: gazera/README.md
- Vision: gazera/docs/vision.md
- Roadmap: gazera/docs/roadmap.md
- Model card: gazera/docs/model_card.md
- Contributing: gazera/CONTRIBUTING.md
Gazera's package metadata declares the Apache-2.0 license in gazera/pyproject.toml.