A lightweight, OpenAI-compatible API gateway written in Go that routes requests sequentially through configured providers until a successful response is received.
Born from Frustration: Created when Cloudflare AI Gateway unexpectedly started disconnecting users without explanation. This self-hosted alternative gives you full control with no vendor lock-in.
Daily Use Case: Connects to multiple AI providers with free tiers, automatically cycling between them when rate limits are hit - ensuring continuous service.
- Lightweight: ~20MB binary with minimal memory footprint
- Fast: Compiled Go with efficient runtime, no JVM overhead
- Reliable: Sequential provider fallback, automatic retry logic
- Simple: Single binary deployment, YAML configuration
- Secure: API key redaction, non-root execution, restrictive permissions
- OpenAI-compatible: Drop-in replacement for the OpenAI API in tools like n8n. Just change the API base URL and make sure the requested model name matches one of your configured routes.
- Configure the gateway:
    cp config.yaml.example config.yaml
    # Edit config.yaml with your API keys
    # See Configuration below
- Deploy locally:
    ./install.sh build # Build binary
    ./install.sh install-service # Install as systemd service
    sudo systemctl start ai-gateway # Start service
- Or deploy remotely:
    cp .env.example .env # Configure SSH credentials
    # Edit .env with your server details or put them into the command string
    SSH_HOST=your-server.com ./install.sh deploy
- Local: build → install-service for development/production on same machine
- Remote (systemd): deploy handles SSH upload, remote installation, and systemd service setup
- Remote (Docker): deploy-docker builds and deploys as a container behind Traefik
- Binary-only: install for basic binary installation without systemd service
For systemd deployments, put a reverse proxy such as nginx or Traefik in front of the gateway to terminate TLS and secure traffic to it.
Deploy as a container behind Traefik (or any reverse proxy) for HTTPS termination:
- Prerequisites: Docker on the remote server; Traefik with traefik-public network
- Configure: Ensure config.yaml and .env exist with your API keys (GATEWAY_API_KEY, provider keys)
- Deploy:
    cp .env.example .env # if needed
    # Edit .env with SSH_HOST, SSH_USER, DOMAIN, and runtime vars (GATEWAY_API_KEY, etc.)
    ./install.sh deploy-docker
- Domain: Set DOMAIN in .env (e.g. DOMAIN=ai-gateway.example.com). docker-compose uses it for the Traefik Host rule.
- n8n integration: Set Base URL to https://ai-gateway.redevest.ru/v1, Model to a route name (e.g. dynamic/n8n), API Key to your GATEWAY_API_KEY value.
The gateway uses YAML configuration with environment variable substitution:
    api_key: ${GATEWAY_API_KEY} # Gateway authentication key
    port: 8080 # Optional, defaults to 8080
    default_timeout: 300s # Default timeout for requests
    providers:
      - name: cerebras
        api_key: ${CEREBRAS_API_KEY}
        base_url: https://api.cerebras.ai/v1
      - name: openrouter
        api_key: ${OPENROUTER_API_KEY}
        base_url: https://openrouter.ai/api/v1
    routes:
      - name: dynamic/n8n # Exact model name match required
        steps:
          - provider: cerebras
            model: gpt-oss-120b
            conflict_resolution: tools # Remove response_format if tools present
          - provider: openrouter
            model: nvidia/nemotron-3-nano-30b-a3b:free

You can put your API keys into config.yaml directly, but for security purposes it's better to store them in env vars and use them in config.yaml.
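The comment on conflict_resolution: tools in the example above says that response_format is removed when tools are present. A minimal Go sketch of that rule follows. It assumes the request body has been decoded into a generic map; `resolveToolsConflict` is a hypothetical name used for illustration and is not taken from the gateway's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolveToolsConflict is a hypothetical helper, not the gateway's actual code.
// It drops "response_format" from a chat-completions payload when a non-empty
// "tools" array is present, and leaves the payload untouched otherwise.
func resolveToolsConflict(body map[string]any) {
	tools, ok := body["tools"].([]any)
	if ok && len(tools) > 0 {
		delete(body, "response_format")
	}
}

func main() {
	raw := []byte(`{"model":"dynamic/n8n","tools":[{"type":"function"}],"response_format":{"type":"json_object"}}`)

	var body map[string]any
	if err := json.Unmarshal(raw, &body); err != nil {
		panic(err)
	}

	resolveToolsConflict(body)

	out, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // the printed payload no longer contains response_format
}
```

Whether the real gateway treats an empty tools array as "present" is not stated in this README; the sketch assumes it does not.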
Configuration Locations:
- ./config.yaml (current directory)
- /etc/ai-gateway/config.yaml (system location)
Environment Variables:
- GATEWAY_API_KEY: Required for authentication
- Provider API keys: ${PROVIDER_NAME}_API_KEY
- Missing ${VAR} values cause startup errors with a clear list of missing vars
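The substitution behaviour described above (replace each ${VAR}, then fail with a list of everything that is missing) could look roughly like the Go sketch below. `expandConfig`, the regular expression, and the error wording are assumptions made for illustration; the sketch also assumes that only unset variables count as missing.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
	"sort"
	"strings"
)

// varPattern matches ${NAME} placeholders (assumed syntax, based on the README).
var varPattern = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// expandConfig is a hypothetical helper, not the gateway's actual code.
// It replaces every ${VAR} using lookup and, if any variable is unset,
// returns a single error naming all of them in sorted order.
func expandConfig(text string, lookup func(string) (string, bool)) (string, error) {
	missing := map[string]bool{}
	out := varPattern.ReplaceAllStringFunc(text, func(m string) string {
		name := m[2 : len(m)-1] // strip the leading "${" and trailing "}"
		val, ok := lookup(name)
		if !ok {
			missing[name] = true
			return m
		}
		return val
	})
	if len(missing) > 0 {
		names := make([]string, 0, len(missing))
		for n := range missing {
			names = append(names, n)
		}
		sort.Strings(names)
		return "", fmt.Errorf("missing environment variables: %s", strings.Join(names, ", "))
	}
	return out, nil
}

func main() {
	cfg := "api_key: ${GATEWAY_API_KEY}\nport: 8080\n"
	expanded, err := expandConfig(cfg, os.LookupEnv)
	if err != nil {
		fmt.Println("startup error:", err)
		return
	}
	fmt.Print(expanded)
}
```

Collecting all missing names before failing means one startup attempt reports every problem at once, which matches the "clear list of missing vars" wording.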
All endpoints except for /health require authentication.
Send the configured gateway API key in either the X-Api-Key header or an Authorization: Bearer <token> header.
    GET /health
Returns {"status": "healthy"} - no authentication required.
    GET /v1/models
    Headers: X-Api-Key: <gateway-api-key> OR Authorization: Bearer <token>
Returns available route names from the configuration, which serve as the model names for requests.
    POST /v1/chat/completions
    Headers: X-Api-Key: <gateway-api-key> OR Authorization: Bearer <token>
Routes requests to providers. Set model to the desired route name.
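The two accepted header forms can be checked with a few lines of Go. This is a sketch under assumptions, not the gateway's implementation: `keyFromHeaders` and `authorized` are hypothetical names, and the choice to look at X-Api-Key before Authorization is arbitrary.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// keyFromHeaders is a hypothetical helper. It returns the credential sent in
// X-Api-Key, or the token from "Authorization: Bearer <token>", or "" if
// neither is present. Checking X-Api-Key first is an assumption.
func keyFromHeaders(h http.Header) string {
	if k := h.Get("X-Api-Key"); k != "" {
		return k
	}
	const prefix = "Bearer "
	auth := h.Get("Authorization")
	if strings.HasPrefix(auth, prefix) {
		return strings.TrimSpace(auth[len(prefix):])
	}
	return ""
}

// authorized reports whether the request headers carry the gateway key.
// The comparison is constant-time to avoid leaking key contents via timing.
func authorized(h http.Header, gatewayKey string) bool {
	got := keyFromHeaders(h)
	if got == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(got), []byte(gatewayKey)) == 1
}

func main() {
	h := http.Header{}
	h.Set("Authorization", "Bearer my-gateway-key")
	fmt.Println("authorized:", authorized(h, "my-gateway-key"))
}
```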
    sudo systemctl start ai-gateway # Start service
    sudo systemctl stop ai-gateway # Stop service
    sudo systemctl enable ai-gateway # Enable auto-start
    sudo systemctl status ai-gateway # Check status
    sudo journalctl -u ai-gateway -f # View logs
- Security: API key redaction, non-root execution, restrictive file permissions (600), TLS recommended
- Logging: Structured JSON logs with request/response summaries, automatic key redaction
- Error Handling: Sequential provider fallback on any error, detailed error messages with provider info
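Sequential fallback on any error can be pictured as a simple loop over a route's steps. The Go sketch below illustrates the idea only; the `step` type, `runRoute`, and `errRateLimited` are hypothetical names, and the real gateway performs HTTP calls where this sketch uses plain functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errRateLimited is a stand-in error used only in this example.
var errRateLimited = errors.New("rate limited")

// step is a hypothetical pairing of a provider name with the call made to it.
type step struct {
	provider string
	call     func() (string, error)
}

// runRoute tries each step in order and returns the first successful response.
// If every step fails, the returned error names each provider and its failure.
func runRoute(steps []step) (string, error) {
	if len(steps) == 0 {
		return "", errors.New("route has no steps")
	}
	var errs []error
	for _, s := range steps {
		resp, err := s.call()
		if err == nil {
			return resp, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", s.provider, err))
	}
	return "", errors.Join(errs...)
}

func main() {
	resp, err := runRoute([]step{
		{"cerebras", func() (string, error) { return "", errRateLimited }},
		{"openrouter", func() (string, error) { return "hello from openrouter", nil }},
	})
	if err != nil {
		fmt.Println("all providers failed:", err)
		return
	}
	fmt.Println(resp)
}
```

Wrapping each failure with the provider name keeps the combined error informative when no step succeeds, in the spirit of the "detailed error messages with provider info" item above.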
The gateway can send OpenTelemetry traces and logger events directly to any OTLP/HTTP-compliant collector. Configure the following environment variables to point at your observability backend (Grafana Cloud, Alloy, Tempo, or other OTLP destination):
- OTLP_ENDPOINT: Full URL to the OTLP HTTP endpoint. Supports both host:port and full URLs like https://otlp-gateway.example.com/otlp.
- OTLP_API_KEY: API key or token.
  - For Grafana Cloud: You can use a standard glc_ Access Policy Token. The gateway automatically extracts the Instance ID from the token and handles the required Basic authentication (instanceID:apiKey).
  - For other collectors: It uses the provided key for Basic authentication (apiKey:).
- OTEL_SERVICE_NAME (or OTLP_SERVICE_NAME): Optional. The service name (ai-gateway) used to group spans/logs.
- OTEL_RESOURCE_ATTRIBUTES (or OTLP_RESOURCE_ATTRIBUTES): Optional. Comma-separated key=value pairs added to each resource (e.g., deployment.environment=production).
- OTLP_HEADERS: Optional. Extra headers in Key=Value CSV format.
The gateway uses the OTLP HTTP exporter for maximum compatibility (bypassing gRPC/ALPN issues). It automatically handles the /v1/traces signal path, ensuring that if you provide a base URL (like Grafana's /otlp), it still reaches the correct endpoint.
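To make the endpoint handling concrete, here is a small Go sketch of how a base URL might be normalised to the /v1/traces signal path and how a Basic authentication header is built from a user and password pair. `tracesURL` and `basicAuth` are hypothetical helpers, and defaulting a bare host:port to https is an assumption; the README does not say which scheme the gateway picks.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// tracesURL is a hypothetical helper, not the gateway's actual code.
// It appends /v1/traces to the endpoint unless the path already ends with it.
// A bare host:port is given an https scheme, which is an assumption.
func tracesURL(endpoint string) string {
	if !strings.Contains(endpoint, "://") {
		endpoint = "https://" + endpoint
	}
	endpoint = strings.TrimRight(endpoint, "/")
	if strings.HasSuffix(endpoint, "/v1/traces") {
		return endpoint
	}
	return endpoint + "/v1/traces"
}

// basicAuth builds an HTTP Basic Authorization header value.
// For the non-Grafana case described above, user is the API key and pass is empty.
func basicAuth(user, pass string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(user+":"+pass))
}

func main() {
	fmt.Println(tracesURL("https://otlp-gateway.example.com/otlp"))
	fmt.Println(tracesURL("localhost:4318"))
	fmt.Println(basicAuth("my-api-key", ""))
}
```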
MIT