
Commit cdd7e1a

feat: add API server, Docker support, and tests
- Add FastAPI-based OpenAI-compatible API server
- Add Dockerfile and Dockerfile-nvidia for containerization
- Add unit tests for API and inference
- Update CLI to support `serve` command
- Update requirements.txt
- Update README.md with usage instructions and badges
1 parent cdb05cf commit cdd7e1a

File tree

10 files changed (+684, -6 lines)

Dockerfile

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@

```dockerfile
FROM python:3.10-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

# Install system dependencies
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Create a user first to handle permissions correctly from the start
RUN useradd -m -u 1000 user

# Switch to user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

# Set up application directory with correct permissions
WORKDIR $HOME/app

# Copy requirements and install
COPY --chown=user requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

# Copy application code
COPY --chown=user . .

# Expose port
EXPOSE 7860

# Command to run the application
CMD ["python3", "-m", "aetheris.cli.main", "serve", "--host", "0.0.0.0", "--port", "7860"]
```
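Since the image copies the entire build context (`COPY --chown=user . .`), a `.dockerignore` would keep VCS data and caches out of the image. A plausible sketch — this file is hypothetical and not part of the commit:

```
# Hypothetical .dockerignore (not included in this commit)
.git
__pycache__/
*.pyc
.venv/
.pytest_cache/
```

Whether to also exclude `checkpoints/` depends on whether model weights should be baked into the image or mounted at runtime.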

Dockerfile-nvidia

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@

```dockerfile
# Use NVIDIA CUDA base image for GPU support
FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y \
    python3-pip \
    python3-dev \
    git \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Install Python dependencies
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose port (7860 is the default for Hugging Face Spaces)
EXPOSE 7860

# Create a user to avoid running as root (good practice, and Hugging Face
# Spaces typically runs containers as user 1000)
RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH

WORKDIR $HOME/app
COPY --chown=user . $HOME/app

# Command to run the application via the CLI `serve` command
CMD ["python3", "-m", "aetheris.cli.main", "serve", "--host", "0.0.0.0", "--port", "7860"]
```
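To drive both images from one place, a `docker-compose.yml` could wrap the two builds. This is a hypothetical sketch, not part of the commit — the service names, host ports, and checkpoint mount are assumptions:

```yaml
services:
  aetheris-cpu:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7860:7860"
    volumes:
      - ./checkpoints:/home/user/app/checkpoints  # assumed checkpoint location

  aetheris-gpu:
    build:
      context: .
      dockerfile: Dockerfile-nvidia
    ports:
      - "7861:7860"  # second host port so both services can run at once
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```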

README.md

Lines changed: 50 additions & 6 deletions
````diff
@@ -3,12 +3,12 @@
 <p align="center">
 <img src="https://img.shields.io/badge/Status-Experimental-yellow.svg" alt="Status">
 <img src="https://img.shields.io/badge/License-MIT-green.svg" alt="License">
-<img src="https://img.shields.io/badge/Python-3.8+-blue.svg" alt="Python">
+<img src="https://img.shields.io/badge/Python-3.10+-blue.svg" alt="Python">
 <img src="https://img.shields.io/badge/PyTorch-2.0+-orange.svg" alt="PyTorch">
+<img src="https://img.shields.io/badge/API-FastAPI-009688.svg" alt="FastAPI">
 </p>
 
 
-
 **Aetheris** is a hobbyist research project and experimental implementation exploring the intersection of **State Space Models (Mamba)** and **Mixture of Experts (MoE)**.
 
 The goal of this project was to learn by doing: attempting to combine the linear-time inference of Mamba with the sparse scaling capacity of MoE from scratch in PyTorch. It is designed as a playground for understanding these modern architectures, not as a published academic paper or production-ready foundation model.
@@ -36,15 +36,31 @@ This code is provided for educational purposes and for others who want to experi
 
 ### Installation
 
+**Option 1: Local Python Environment**
+
 ```bash
 git clone https://github.com/Pomilon/Aetheris.git
 cd Aetheris
 pip install -r requirements.txt
-````
+```
+
+**Option 2: Docker**
+
+We provide Dockerfiles for both CPU (slim) and GPU (NVIDIA) environments.
+
+```bash
+# CPU Version
+docker build -t aetheris-cpu -f Dockerfile .
+docker run -p 7860:7860 aetheris-cpu
+
+# GPU Version (Requires NVIDIA Container Toolkit)
+docker build -t aetheris-gpu -f Dockerfile-nvidia .
+docker run --gpus all -p 7860:7860 aetheris-gpu
+```
 
 ### Usage (CLI)
 
-Aetheris includes a CLI to train or inference the model.
+Aetheris includes a CLI to train, run inference with, or serve the model.
 
 **1. Training (From Scratch)**
 
@@ -53,12 +69,40 @@ Aetheris includes a CLI to train or inference the model.
 python -m aetheris.cli.main train --config configs/default.yaml
 ```
 
-**2. Generation**
+**2. Generation (CLI)**
 
 ```bash
 python -m aetheris.cli.main generate --prompt "The quick brown fox" --checkpoint_dir checkpoints
 ```
 
+**3. API Server (OpenAI-Compatible)**
+
+Start a local API server that simulates OpenAI's chat completions endpoint.
+
+```bash
+python -m aetheris.cli.main serve --host 0.0.0.0 --port 8000
+```
+
+You can then interact with it using standard tools:
+
+```bash
+curl http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "aetheris-hybrid",
+    "messages": [{"role": "user", "content": "Hello!"}],
+    "stream": true
+  }'
+```
+
+### Development & Testing
+
+To run the test suite:
+
+```bash
+pytest tests/
+```
+
 ## ⚙️ Configuration
 
 You can tweak the hyperparameters in `configs/`. I've included a "Debug" config that is small enough to train on a laptop CPU for testing the code flow.
@@ -99,4 +143,4 @@ python -m aetheris.cli.main generate --prompt "The quick brown fox" --checkpoint
 
 ## License
 
-MIT
+MIT
````
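With `"stream": true`, an OpenAI-compatible endpoint like the one added here replies as server-sent events: one `data: <chunk JSON>` line per token batch, terminated by `data: [DONE]`. A minimal sketch of reassembling the text client-side — the sample payload below is illustrative, not captured from a real Aetheris response:

```python
import json


def collect_stream(sse_body: str) -> str:
    """Concatenate delta.content fields from an OpenAI-style SSE body."""
    pieces = []
    for line in sse_body.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        # The first chunk usually carries only the role, no content.
        pieces.append(delta.get("content") or "")
    return "".join(pieces)


# Illustrative sample of what a streamed reply could look like.
sample = "\n".join([
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": ", world"}}]}',
    "data: [DONE]",
])
print(collect_stream(sample))  # Hello, world
```

In a real client the same loop would iterate over the HTTP response line-by-line instead of a pre-collected string.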

aetheris/api/schemas.py

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@

```python
from typing import List, Optional, Union, Dict, Any
from pydantic import BaseModel, Field
import time

class ChatMessage(BaseModel):
    role: str
    content: str

class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    stream: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    max_tokens: Optional[int] = None
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    logit_bias: Optional[Dict[str, float]] = None
    user: Optional[str] = None

class ChatCompletionChoice(BaseModel):
    index: int
    message: ChatMessage
    finish_reason: Optional[str] = None

class ChatCompletionResponse(BaseModel):
    id: str
    object: str = "chat.completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionChoice]
    usage: Optional[Dict[str, int]] = None

class ChatCompletionChunkDelta(BaseModel):
    role: Optional[str] = None
    content: Optional[str] = None

class ChatCompletionChunkChoice(BaseModel):
    index: int
    delta: ChatCompletionChunkDelta
    finish_reason: Optional[str] = None

class ChatCompletionChunk(BaseModel):
    id: str
    object: str = "chat.completion.chunk"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[ChatCompletionChunkChoice]

class CompletionRequest(BaseModel):
    model: str
    prompt: Union[str, List[str]]
    suffix: Optional[str] = None
    max_tokens: Optional[int] = 16
    temperature: Optional[float] = 1.0
    top_p: Optional[float] = 1.0
    n: Optional[int] = 1
    stream: Optional[bool] = False
    logprobs: Optional[int] = None
    echo: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    presence_penalty: Optional[float] = 0.0
    frequency_penalty: Optional[float] = 0.0
    best_of: Optional[int] = 1
    logit_bias: Optional[Dict[str, float]] = None
    user: Optional[str] = None

class CompletionChoice(BaseModel):
    text: str
    index: int
    logprobs: Optional[Any] = None
    finish_reason: Optional[str] = None

class CompletionResponse(BaseModel):
    id: str
    object: str = "text_completion"
    created: int = Field(default_factory=lambda: int(time.time()))
    model: str
    choices: List[CompletionChoice]
    usage: Optional[Dict[str, int]] = None

class ModelCard(BaseModel):
    id: str
    object: str = "model"
    created: int = Field(default_factory=lambda: int(time.time()))
    owned_by: str = "aetheris"

class ModelList(BaseModel):
    object: str = "list"
    data: List[ModelCard]
```
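These pydantic models mirror OpenAI's wire format, so FastAPI can validate request bodies against them automatically. A minimal sketch of that validation step — it re-declares a trimmed subset of `ChatMessage` and `ChatCompletionRequest` locally so the snippet runs standalone; in the repo you would import them from `aetheris.api.schemas`:

```python
from typing import List, Optional, Union

from pydantic import BaseModel


# Trimmed local copies of two of the schemas above, for illustration only.
class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[ChatMessage]
    temperature: Optional[float] = 1.0
    stream: Optional[bool] = False
    stop: Optional[Union[str, List[str]]] = None
    max_tokens: Optional[int] = None


# Validate an incoming JSON body the way FastAPI would on a POST
# to /v1/chat/completions.
payload = {
    "model": "aetheris-hybrid",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": True,
}
req = ChatCompletionRequest(**payload)

print(req.model)                # aetheris-hybrid
print(req.messages[0].content)  # Hello!
print(req.temperature)          # 1.0 (default applied)
```

A payload missing `model` or `messages`, or with a non-string `content`, would raise a `ValidationError`, which FastAPI turns into a 422 response.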
