⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
AI Gateway: Claude Pro, Copilot, Gemini subscriptions → OpenAI/Anthropic/Gemini APIs. No API keys needed.
A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Built in Python on the FastAPI framework, it offers an efficient, scalable, and user-friendly way to run MLX-based vision and language models locally behind an OpenAI-compatible interface.
Proxies Z.ai Chat into OpenAI/Anthropic-compatible formats; supports multi-model list mapping, token-free access, smart handling of reasoning chains, image upload, and more. Z.ai ZtoApi z2api ZaitoApi zai X-Signature signing GLM 4.5 v 4.6
OpenClaw alternative in your pocket
Home Assistant LLM integration for local OpenAI-compatible services (llamacpp, vllm, etc)
duck.ai OpenAI-compatible API server
High-performance Ollama proxy with per-user fair-share queuing, round-robin scheduling, and a real-time TUI dashboard. Built in Rust.
Unified LLM API client library for Python. Simple API for Chat, Embedding, Rerank, and Tokenizer. OpenAI-compatible with streaming support and unified usage tracking.
A high-performance, self-hosted Deno proxy that makes Fal.ai's powerful models (Flux, SDXL, etc.) compatible with the standard OpenAI image generation API. Use any OpenAI client seamlessly.
Ollama Client – Chat with Local LLMs Inside Your Browser. A lightweight, privacy-first Chrome extension to chat with local LLMs via Ollama, LM Studio, and llama.cpp. Supports streaming, stop/regenerate, RAG, and easy model switching — all without cloud APIs or data leaks.
Complete guide to convert GitHub Copilot into an OpenAI-compatible API
Expose agent CLIs (Codex, Cursor Agent, Claude Code, Gemini) as an OpenAI-compatible /v1 API gateway.
A high-performance adapter that makes GLM-4.5 work seamlessly with the Agent TARS system: solves toolcall compatibility, provides intelligent fallback, and delivers a top-tier AI agent experience at very low cost.
A Docker-based OpenAI-compatible Text-to-Speech API server powered by Kyutai's TTS models with GPU acceleration support.
Gemini API: Rotate keys, break limits.
🚀 Intelligent Ollama API proxy pool built on Cloudflare Workers: supports multi-account rotation, automatic failover, load balancing, and unified authentication.
AI Amnesia solved. 5-layer persistent memory for local AI assistants. Built by a non-coder running a live business. 353 sessions. 8 months. One nuclear reset. Still running.
Define AI agent roles in YAML and run them anywhere: CLI, API server, or autonomous daemon
OpenAI-compatible AI proxy: Anthropic Claude, Google Gemini, GPT-5, Cloudflare AI. Free hosting, automatic failover, token rotation. Deploy in 1 minute.
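The servers and proxies listed above all speak the same wire format: clients POST the standard OpenAI chat-completions JSON to `<base_url>/v1/chat/completions`, which is what lets any OpenAI client target them by overriding the base URL. A minimal sketch of that shared request shape (the model name `local-model` is a placeholder assumption; each server maps it to whatever model it has loaded):

```python
import json

# Minimal sketch of the request body shared by OpenAI-compatible servers.
# Clients POST this JSON to <base_url>/v1/chat/completions; only the base
# URL changes between the projects above.
payload = {
    "model": "local-model",  # placeholder; server-side model identifier
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "stream": False,  # True requests server-sent-event token streaming
}

body = json.dumps(payload)
print(body)
```

Because the shape is fixed, swapping one backend for another (llama.cpp, vLLM, Ollama behind a proxy, or a hosted gateway) usually requires changing nothing but the endpoint URL and, if required, the API key.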