
Commit cdd7e1a

feat: add API server, Docker support, and tests
- Add FastAPI-based OpenAI-compatible API server
- Add Dockerfile and Dockerfile-nvidia for containerization
- Add unit tests for API and inference
- Update CLI to support `serve` command
- Update requirements.txt
- Update README.md with usage instructions and badges
1 parent cdb05cf commit cdd7e1a

File tree

10 files changed (+684, -6 lines)

Dockerfile

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@

```dockerfile
FROM python:3.10-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Create a user first to handle permissions correctly from the start
RUN useradd -m -u 1000 user

# Switch to user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

# Set up application directory with correct permissions
WORKDIR $HOME/app

# Copy requirements and install
COPY --chown=user requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

# Copy application code
COPY --chown=user . .

# Expose port
EXPOSE 7860

# Command to run the application
CMD ["python3", "-m", "aetheris.cli.main", "serve", "--host", "0.0.0.0", "--port", "7860"]
```
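Since the image copies the entire build context (`COPY --chown=user . .`), a `.dockerignore` would keep VCS data and caches out of the image. A plausible sketch — this file is hypothetical and not part of the commit:

```
# Hypothetical .dockerignore (not included in this commit)
.git
__pycache__/
*.pyc
.venv/
.pytest_cache/
```

Whether to also exclude `checkpoints/` depends on whether model weights should be baked into the image or mounted at runtime.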

Dockerfile-nvidia

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@

```dockerfile
# Use NVIDIA CUDA base image for GPU support
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y \
    python3-pip \
    python3-dev \
    git \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install Python dependencies
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port (7860 is the default for Hugging Face Spaces)
EXPOSE 7860

# Create a user to avoid running as root (good practice, and Hugging Face
# Spaces typically runs containers as user 1000)
RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

WORKDIR $HOME/app
COPY --chown=user . $HOME/app

# Command to run the application via the CLI `serve` command
CMD ["python3", "-m", "aetheris.cli.main", "serve", "--host", "0.0.0.0", "--port", "7860"]
```
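To drive both images from one place, a `docker-compose.yml` could wrap the two builds. This is a hypothetical sketch, not part of the commit — the service names, host ports, and checkpoint mount are assumptions:

```yaml
services:
  aetheris-cpu:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7860:7860"
    volumes:
      - ./checkpoints:/home/user/app/checkpoints  # assumed checkpoint location

  aetheris-gpu:
    build:
      context: .
      dockerfile: Dockerfile-nvidia
    ports:
      - "7861:7860"  # second host port so both services can run at once
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```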

README.md

Lines changed: 50 additions & 6 deletions
````diff
@@ -3,12 +3,12 @@
 <p align="center">
 <img src="https://img.shields.io/badge/Status-Experimental-yellow.svg" alt="Status">
 <img src="https://img.shields.io/badge/License-MIT-green.svg" alt="License">
-<img src="https://img.shields.io/badge/Python-3.8+-blue.svg" alt="Python">
+<img src="https://img.shields.io/badge/Python-3.10+-blue.svg" alt="Python">
 <img src="https://img.shields.io/badge/PyTorch-2.0+-orange.svg" alt="PyTorch">
+<img src="https://img.shields.io/badge/API-FastAPI-009688.svg" alt="FastAPI">
 </p>
 
 
-
 **Aetheris** is a hobbyist research project and experimental implementation exploring the intersection of **State Space Models (Mamba)** and **Mixture of Experts (MoE)**.
 
 The goal of this project was to learn by doing: attempting to combine the linear-time inference of Mamba with the sparse scaling capacity of MoE from scratch in PyTorch. It is designed as a playground for understanding these modern architectures, not as a published academic paper or production-ready foundation model.
@@ -36,15 +36,31 @@ This code is provided for educational purposes and for others who want to experi
 
 ### Installation
 
+**Option 1: Local Python Environment**
+
 ```bash
 git clone https://github.com/Pomilon/Aetheris.git
 cd Aetheris
 pip install -r requirements.txt
-````
+```
+
+**Option 2: Docker**
+
+We provide Dockerfiles for both CPU (slim) and GPU (NVIDIA) environments.
+
+```bash
+# CPU Version
+docker build -t aetheris-cpu -f Dockerfile .
+docker run -p 7860:7860 aetheris-cpu
+
+# GPU Version (Requires NVIDIA Container Toolkit)
+docker build -t aetheris-gpu -f Dockerfile-nvidia .
+docker run --gpus all -p 7860:7860 aetheris-gpu
+```
 
 ### Usage (CLI)
 
-Aetheris includes a CLI to train or inference the model.
+Aetheris includes a CLI to train, run inference with, or serve the model.
 
 **1. Training (From Scratch)**
 
@@ -53,12 +69,40 @@ Aetheris includes a CLI to train or inference the model.
 python -m aetheris.cli.main train --config configs/default.yaml
 ```
 
-**2. Generation**
+**2. Generation (CLI)**
 
 ```bash
 python -m aetheris.cli.main generate --prompt "The quick brown fox" --checkpoint_dir checkpoints
 ```
 
+**3. API Server (OpenAI-Compatible)**
+
+Start a local API server that simulates OpenAI's chat completions endpoint.
+
+```bash
+python -m aetheris.cli.main serve --host 0.0.0.0 --port 8000
+```
+
+You can then interact with it using standard tools:
+
+```bash
+curl http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "aetheris-hybrid",
+    "messages": [{"role": "user", "content": "Hello!"}],
+    "stream": true
+  }'
+```
+
+### Development & Testing
+
+To run the test suite:
+
+```bash
+pytest tests/
+```
+
 ## ⚙️ Configuration
 
 You can tweak the hyperparameters in `configs/`. I've included a "Debug" config that is small enough to train on a laptop CPU for testing the code flow.
@@ -99,4 +143,4 @@ python -m aetheris.cli.main generate --prompt "The quick brown fox" --checkpoint
 
 ## License
 
-MIT
+MIT
````
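With `"stream": true`, an OpenAI-compatible endpoint like the one added here replies as server-sent events: one `data: <chunk JSON>` line per token batch, terminated by `data: [DONE]`. A minimal sketch of reassembling the text client-side — the sample payload below is illustrative, not captured from a real Aetheris response:

```python
import json


def collect_stream(sse_body: str) -> str:
    """Concatenate delta.content fields from an OpenAI-style SSE body."""
    pieces = []
    for line in sse_body.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        # The first chunk usually carries only the role, no content.
        pieces.append(delta.get("content") or "")
    return "".join(pieces)


# Illustrative sample of what a streamed reply could look like.
sample = "\n".join([
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ", world"}}]}',
    "data: [DONE]",
])
print(collect_stream(sample))  # Hello, world
```

In a real client the same loop would iterate over the HTTP response line-by-line instead of a pre-collected string.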

aetheris/api/schemas.py

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@

```python
from typing import List, Optional, Union, Dict, Any
from pydantic import BaseModel, Field
import time

class ChatMessage(BaseModel):
    role: str
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    stream: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    max_tokens: Optional[int] = None
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    logit_bias: Optional[Dict[str, float]] = None
    user: Optional[str] = None

class ChatCompletionChoice(BaseModel):
    index: int
    message: ChatMessage
    finish_reason: Optional[str] = None

class ChatCompletionResponse(BaseModel):
    id: str
    object: str = "chat.completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionChoice]
    usage: Optional[Dict[str, int]] = None

class ChatCompletionChunkDelta(BaseModel):
    role: Optional[str] = None
    content: Optional[str] = None

class ChatCompletionChunkChoice(BaseModel):
    index: int
    delta: ChatCompletionChunkDelta
    finish_reason: Optional[str] = None

class ChatCompletionChunk(BaseModel):
    id: str
    object: str = "chat.completion.chunk"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionChunkChoice]

class CompletionRequest(BaseModel):
    model: str
    prompt: Union[str, List[str]]
    suffix: Optional[str] = None
    max_tokens: Optional[int] = 16
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    stream: Optional[bool] = False
    logprobs: Optional[int] = None
    echo: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    best_of: Optional[int] = 1
    logit_bias: Optional[Dict[str, float]] = None
    user: Optional[str] = None

class CompletionChoice(BaseModel):
    text: str
    index: int
    logprobs: Optional[Any] = None
    finish_reason: Optional[str] = None

class CompletionResponse(BaseModel):
    id: str
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionChoice]
    usage: Optional[Dict[str, int]] = None

class ModelCard(BaseModel):
    id: str
    object: str = "model"
    created: int = Field(default_factory=lambda: int(time.time()))
    owned_by: str = "aetheris"

class ModelList(BaseModel):
    object: str = "list"
    data: List[ModelCard]
```
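These pydantic models mirror OpenAI's wire format, so FastAPI can validate request bodies against them automatically. A minimal sketch of that validation step — it re-declares a trimmed subset of `ChatMessage` and `ChatCompletionRequest` locally so the snippet runs standalone; in the repo you would import them from `aetheris.api.schemas`:

```python
from typing import List, Optional, Union

from pydantic import BaseModel


# Trimmed local copies of two of the schemas above, for illustration only.
class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
    temperature: Optional[float] = 1.0
    stream: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    max_tokens: Optional[int] = None


# Validate an incoming JSON body the way FastAPI would on a POST
# to /v1/chat/completions.
payload = {
    "model": "aetheris-hybrid",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}
req = ChatCompletionRequest(**payload)

print(req.model)                # aetheris-hybrid
print(req.messages[0].content)  # Hello!
print(req.temperature)          # 1.0 (default applied)
```

A payload missing `model` or `messages`, or with a non-string `content`, would raise a `ValidationError`, which FastAPI turns into a 422 response.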
