# EllProxy

**Ultra-Performance AI Dispatch Gateway for macOS** (v1.0.0-beta)

Seamlessly proxy Gemini & Claude. OpenAI-Compatible. Privacy First.

Features • GUI Overview • Architecture • Installation • Integration
EllProxy is a next-generation native macOS menu bar application designed for developers and AI enthusiasts. It combines multi-account management, protocol conversion, and smart request scheduling to give you a stable, high-speed, and low-cost Local AI Relay Station.

With it, you can turn common AI subscriptions (Claude, Gemini, etc.) into standardized API interfaces, letting you use powerful tools like Factory Droids, AmpCode, and Trae without purchasing separate API credits.
> [!NOTE]
> Forked from VibeProxy v1.8.23, enhanced with modular architecture, advanced model management, and automated release workflows.
## Features

- **Global Status Monitoring**: Instant insight into server health, the active port, and connection status directly from your menu bar.
- **One-Click Control**: Toggle "Smart Routing" and "Thinking Mode" instantly without digging through menus.
- **Non-intrusive Design**: Runs silently in the background with a minimal memory footprint, optimized for Apple Silicon (M1-M4).
- **Unified Auth System**: Supports Google (Gemini), Anthropic (Claude), OpenAI (ChatGPT), Qwen, and Antigravity accounts.
- **Multi-Account Round-Robin**: Automatically rotates between multiple accounts for the same provider to maximize rate limits.
- **Secure Storage**: All credentials are encrypted and stored safely in the macOS Keychain.
- **OpenAI Compatible**: Provides a standardized `/v1/chat/completions` endpoint compatible with 99% of AI tools (VS Code extensions, terminal agents).
- **Anthropic Compatible**: Fully supports the new `thinking` capability in Claude Code CLI, enabling extended reasoning models like `claude-4-5-sonnet-thinking`.
- **Coding Agent Ready**: Dedicated support for Factory Droids and Amp CLI, transforming them into infinite-context coding machines.
- **Fast Track vs Thinking Track**: Automatically routes standard requests to fast models (e.g., Gemini Flash) and reasoning requests to powerful models (e.g., Claude Opus/Sonnet Thinking); a sketch of this decision follows the list.
- **Auto-Failover**: Smartly detects failures and redirects requests to your configured backup models, so your coding flow never stops.
- **Model Sync**: One-click discovery of all available models from your connected providers.
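To make the round-robin rotation and the fast/thinking split concrete, here is a minimal illustrative sketch in Python. It is not EllProxy's actual implementation; the model names, account lists, and the `pick_upstream` helper are assumptions chosen for the example.

```python
from itertools import cycle

# Illustrative sketch only, not EllProxy internals.
# Idea 1: requests flagged as "thinking" go to a reasoning model,
#         everything else goes to a fast model.
# Idea 2: accounts for the same provider rotate round-robin.

FAST_MODEL = "gemini-2.0-flash"                # hypothetical fast-track choice
THINKING_MODEL = "claude-4-5-sonnet-thinking"  # hypothetical thinking-track choice

# cycle() yields accounts in order, forever: a simple round-robin.
ACCOUNTS = {
    "google": cycle(["acct-a@gmail.com", "acct-b@gmail.com"]),
    "anthropic": cycle(["acct-c@example.com"]),
}

def pick_upstream(request: dict) -> tuple[str, str]:
    """Return (model, account) for one incoming chat request."""
    if request.get("thinking"):
        return THINKING_MODEL, next(ACCOUNTS["anthropic"])
    return FAST_MODEL, next(ACCOUNTS["google"])

print(pick_upstream({"model": "ellproxy-default"}))                    # fast track
print(pick_upstream({"model": "ellproxy-default", "thinking": True}))  # thinking track
```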
## Architecture

```mermaid
graph TD
    Client([User Tools: Factory/Amp/VSCode]) -->|Standardized API Protocols| Gateway[EllProxy Server :8317]
    Gateway --> Router[Smart Model Router]
    Router -->|Thinking Request?| ThinkingEngine[Thinking Proxy Engine]
    Router -->|Standard Request?| FastTrack[Fast Track Engine]
    ThinkingEngine -->|Inject Thinking Params| UpstreamA[Upstream Provider A]
    FastTrack -->|Round Robin| UpstreamB[Upstream Provider B]
    UpstreamA --> ResponseMapper[Response Normalizer]
    UpstreamB --> ResponseMapper
    ResponseMapper --> Client
```
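The Response Normalizer in the diagram maps each upstream's native payload back into the OpenAI chat-completion shape the client expects. As a rough illustration only (not EllProxy's actual code), flattening an Anthropic-style response could look like this:

```python
# Illustration: collapse Anthropic-style content blocks into the
# single-string OpenAI chat-completion shape clients expect.
def normalize_anthropic(resp: dict, model: str) -> dict:
    text = "".join(
        block.get("text", "")
        for block in resp.get("content", [])
        if block.get("type") == "text"
    )
    return {
        "object": "chat.completion",
        "model": model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop" if resp.get("stop_reason") == "end_turn" else "length",
        }],
    }

# Example upstream payload (abridged Anthropic Messages shape):
upstream = {"content": [{"type": "text", "text": "Hello!"}], "stop_reason": "end_turn"}
print(normalize_anthropic(upstream, "claude-4-5-sonnet-thinking"))
```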
## Installation

Download from GitHub Releases:

1. Download `EllProxy.zip` or `EllProxy.dmg`.
2. Extract and drag to `/Applications`.
3. First launch: Right-click → Open (to bypass Gatekeeper for the unsigned app).
Or build from source:

```bash
git clone https://github.com/ellfarnaz/ellproxy.git
cd ellproxy
./create-app-bundle.sh
```
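Once the app is running, you can sanity-check the server from Python. This assumes EllProxy serves the standard OpenAI-compatible `/v1/models` listing on its default port 8317 (suggested by the Model Sync feature, but an assumption here):

```python
from openai import OpenAI

# Assumption: the gateway exposes the standard OpenAI-compatible
# GET /v1/models listing; the API key is a placeholder.
client = OpenAI(base_url="http://localhost:8317/v1", api_key="dummy")
for model in client.models.list():
    print(model.id)
```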
## Integration

Add to `~/.factory/config.json`:

```json
{
  "custom_models": [
    {
      "model": "ellproxy-default",
      "base_url": "http://localhost:8317/v1",
      "api_key": "dummy",
      "provider": "openai"
    }
  ]
}
```

Configure Amp to use the local proxy:
```bash
# Set Amp to use EllProxy
amp config set url http://localhost:8317
```

Or call the proxy directly with the OpenAI Python SDK:

```python
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:8317/v1",
    api_key="dummy-key",  # any placeholder works; EllProxy handles auth upstream
)

response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello via EllProxy!"}],
)
print(response.choices[0].message.content)
```
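For interactive agents you will usually want token streaming. Assuming EllProxy passes through the standard OpenAI `stream=True` flag (typical for OpenAI-compatible proxies, though not stated above), the incremental variant looks like this:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8317/v1", api_key="dummy-key")

# Assumption: the gateway forwards the standard OpenAI streaming flag.
stream = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Stream a short greeting."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries a small delta; some chunks may have no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```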
- **License**: MIT. Open source and free.
- **Privacy**: All data runs locally. No data collection.
If you find this tool helpful, please give it a ⭐️ on GitHub!

Copyright © 2025 EllProxy Team.



