This local server provides OpenAI (/openai) and Anthropic (/anthropic) compatible endpoints through Gemini CodeAssist (Gemini CLI).
- If you have used Gemini CLI before, it will utilize existing Gemini CLI credentials.
- If you have NOT used Gemini CLI before, you will be prompted to log in to the Gemini CLI app through your browser.
Gemini CodeAssist (Gemini CLI) offers a generous free tier. As of 2025-09-01, the free tier allows 60 requests/min and 1,000 requests/day.
Gemini CodeAssist does not provide direct access to Gemini models, which limits your choice to highly rated CodeAssist plugins.
yarn dev
The server will start on http://localhost:3000
- OpenAI compatible endpoint: http://localhost:3000/openai
- Anthropic compatible endpoint: http://localhost:3000/anthropic
yarn dev [options]
Options:
- -p, --port <port> - Server port (default: 3000)
- -g, --google-cloud-project <project> - Google Cloud project ID if you have paid/enterprise tier (default: GOOGLE_CLOUD_PROJECT env variable)
- --disable-browser-auth - Disables browser auth flow and uses code based auth (default: false)
- --enable-google-search - Enables native Google Search tool (default: false)
- --disable-auto-model-switch - Disables auto model switching in case of rate limiting (default: false)
- --oauth-rotation-paths <paths> - Comma-separated paths to OAuth credential files for automatic rotation on rate limits (default: disabled)
- --oauth-rotation-folder <folder> - Path to folder containing OAuth credential files for automatic rotation (default: disabled)
If you have NOT used Gemini CLI before, you will be prompted to log in to the Gemini CLI app through your browser. Credentials will be saved in the file used by Gemini CLI (~/.gemini/oauth_creds.json).
The following Gemini models are supported:
- auto - Enables automatic model switching (starts with gemini-3-pro-preview, downgrades on rate limits)
- gemini-2.5-pro - Previous generation Pro model
- gemini-2.5-flash - Faster, lighter model
- gemini-3-pro-preview - Latest Gemini 3 Pro model (preview, default for "auto")
- gemini-3-flash-preview - Latest Gemini 3 Flash model (preview)
gemini-3-pro-preview is the default model when you request "auto" or when no model is specified.
When you specify a specific model in your API request (e.g., gemini-3-pro-preview), the proxy will use that exact model without applying automatic downgrade/fallback logic.
Auto-switching (Pro → Flash) only occurs when:
- The requested model is "auto", null, or missing from the request
- The model hits rate limits and a fallback is available
This ensures that when you explicitly request a model, you get that model.
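The selection rules above can be sketched in a few lines of TypeScript. This is an illustrative sketch only, not the proxy's actual code: `resolveModel`, `ModelChoice`, and `DEFAULT_MODEL` are hypothetical names introduced here.

```typescript
// Hypothetical sketch of the model selection rules; names are not from the proxy's source.
const DEFAULT_MODEL = "gemini-3-pro-preview";

interface ModelChoice {
  model: string; // model name to send upstream
  allowFallback: boolean; // whether a downgrade is permitted on rate limits
}

function resolveModel(requested?: string | null): ModelChoice {
  // "auto", null, or a missing model all mean "let the proxy decide".
  if (requested === undefined || requested === null || requested === "auto") {
    return { model: DEFAULT_MODEL, allowFallback: true };
  }
  // An explicitly requested model is used as-is, with no fallback.
  return { model: requested, allowFallback: false };
}

console.log(resolveModel("auto"));
console.log(resolveModel("gemini-2.5-flash"));
```

Keeping the fallback decision next to the resolved name makes it hard for later code to downgrade an explicitly requested model by accident.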
Example:
# Request auto-switching (recommended for most use cases)
curl http://localhost:3000/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]}'
# Request specific model (bypasses auto-switching)
curl http://localhost:3000/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "gemini-3-flash-preview", "messages": [{"role": "user", "content": "Hello!"}]}'
Most agentic tools rely on environment variables. You can export the following variables:
export OPENAI_API_BASE=http://localhost:3000/openai
export OPENAI_API_KEY=ItDoesNotMatter
export ANTHROPIC_BASE_URL="http://localhost:3000/anthropic"
export ANTHROPIC_AUTH_TOKEN=ItDoesNotMatter
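For a custom script rather than an existing tool, the same variables can be read directly. The sketch below builds a chat completion request from an environment map. `buildChatRequest`, `Env`, and `ChatRequest` are hypothetical names, and the `Authorization` header is included only for client compatibility, on the assumption (suggested by the placeholder key above) that the proxy does not check its value.

```typescript
// Hypothetical helper; not part of the proxy. Pass process.env in Node.
type Env = Record<string, string | undefined>;

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(env: Env, prompt: string, model: string = "auto"): ChatRequest {
  // Fall back to the default local address when the variable is unset,
  // and strip trailing slashes so the path joins cleanly.
  const base = (env["OPENAI_API_BASE"] ?? "http://localhost:3000/openai").replace(/\/+$/, "");
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the key value is a placeholder and is not validated.
        Authorization: `Bearer ${env["OPENAI_API_KEY"] ?? "ItDoesNotMatter"}`,
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    },
  };
}
```

In Node 18 or later, the result can be passed to the built-in `fetch` as `fetch(req.url, req.init)`.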
Add the following env fields to the .claude/settings.json file:
{
"permissions": {
...
},
"env": {
"ANTHROPIC_BASE_URL": "http://localhost:3000/anthropic",
"ANTHROPIC_AUTH_TOKEN": "NotImportant",
"ANTHROPIC_MODEL": "gemini-2.5-pro"
}
}
Add the following to the Zed config file:
{
"language_models": {
"openai": {
"api_url": "http://localhost:3000/openai",
"available_models": [
{
"name": "gemini-2.5",
"display_name": "localhost:gemini-2.5",
"max_tokens": 131072
}
]
}
}
}
To enable automatic OAuth token rotation when rate limits (HTTP 429) are encountered:
The easiest way to manage multiple accounts is using the built-in auth CLI:
# Add a new account with a specific ID
yarn auth:add account1
# List all authenticated accounts and their status
yarn auth:list
# Check request counts for all accounts
yarn auth:counts
# Remove an account
yarn auth:remove account1
All accounts added via yarn auth:add are automatically stored in ~/.gemini/accounts/ and will be used for rotation if the server is started with --oauth-rotation-folder ~/.gemini/accounts.
To rotate over an explicit list of credential files:
- Create multiple OAuth credential files by authenticating with different Google accounts
- Save each credential file (e.g., ~/.gemini/oauth_creds.json) to a different location
- Start the server with the --oauth-rotation-paths option:
gemini-cli-proxy --oauth-rotation-paths "/path/to/acc1.json,/path/to/acc2.json,/path/to/acc3.json"
Alternatively, to rotate over a folder of credential files:
- Create a folder and save all OAuth credential files in it
- Start the server with the --oauth-rotation-folder option:
# Start server with folder-based rotation
gemini-cli-proxy --oauth-rotation-folder "~/.gemini/accounts"
Benefits of folder-based rotation:
- No need to restart when adding new accounts - just add JSON files to the folder
- All accounts are automatically discovered and rotated
- Easier to manage multiple accounts
- No need to specify individual file paths
When a 429 error is detected:
- The proxy automatically rotates to the next account in the list (round-robin)
- The new credentials are copied to ~/.gemini/oauth_creds.json
- The failed request is automatically retried once with the new account
- A log message indicates which account is now active:
[ROTATOR] Rate limit hit. Switched to account: <filename>
Note: OAuth rotation requires at least 2 credential files (or 2+ JSON files in a folder) to be effective.
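The rotate-and-retry behaviour described above might look roughly like the following. This is a simplified sketch under stated assumptions, not the proxy's implementation: `RoundRobinRotator`, `sendWithRotation`, and the `Send` callback are hypothetical names, and the step that copies credentials to ~/.gemini/oauth_creds.json is omitted.

```typescript
// Hypothetical sketch; class and function names are not from the proxy's source.
type Send = (account: string) => Promise<{ status: number }>;

class RoundRobinRotator {
  private accounts: string[];
  private index = 0;

  constructor(accounts: string[]) {
    // Rotation is only meaningful with at least two credential files.
    if (accounts.length < 2) {
      throw new Error("rotation needs at least 2 credential files");
    }
    this.accounts = accounts.slice();
  }

  get current(): string {
    return this.accounts[this.index];
  }

  // Advance to the next account, wrapping around at the end of the list.
  next(): string {
    this.index = (this.index + 1) % this.accounts.length;
    return this.current;
  }
}

async function sendWithRotation(rotator: RoundRobinRotator, send: Send): Promise<{ status: number }> {
  const first = await send(rotator.current);
  if (first.status !== 429) return first;
  const account = rotator.next();
  // The real proxy would also copy the new credentials into place here.
  console.log(`[ROTATOR] Rate limit hit. Switched to account: ${account}`);
  // Retry exactly once with the new account; a second 429 is returned to the caller.
  return send(account);
}
```

Passing the request as a callback keeps the rotation logic independent of the HTTP client, so it can be exercised with a stub that returns 429 for the first account.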
Account Exhaustion Handling:
- When all OAuth accounts have been exhausted, the rotator will throw an error
- Error message: "All OAuth accounts have been exhausted. Please add new OAuth credential files to the rotation folder or restart the server to reset the exhaustion state."
- To reset exhaustion state and continue rotation, add new OAuth credential files to the folder
- The rotator will then continue cycling through all accounts
- This prevents infinite rotation loops and ensures you're notified when accounts are exhausted
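The exhaustion handling can be illustrated with a small tracker. This sketch assumes that adding a new credential file clears the exhaustion state, as the list above suggests. `ExhaustionTracker` and its methods are hypothetical names, and for brevity it returns the first usable account rather than cycling round-robin.

```typescript
// Hypothetical sketch; not the proxy's actual rotator.
const EXHAUSTED_MESSAGE =
  "All OAuth accounts have been exhausted. Please add new OAuth credential files " +
  "to the rotation folder or restart the server to reset the exhaustion state.";

class ExhaustionTracker {
  private accounts: string[];
  private exhausted = new Set<string>();

  constructor(accounts: string[]) {
    this.accounts = accounts.slice();
  }

  // Record that an account has hit its rate limit.
  markExhausted(account: string): void {
    this.exhausted.add(account);
  }

  // Assumption: discovering a new credential file resets the exhaustion state.
  addAccount(account: string): void {
    if (this.accounts.indexOf(account) === -1) {
      this.accounts.push(account);
      this.exhausted.clear();
    }
  }

  // Return the first usable account, or throw once every account is exhausted.
  // Throwing here is what stops the rotation from looping forever.
  nextAvailable(): string {
    for (const account of this.accounts) {
      if (!this.exhausted.has(account)) return account;
    }
    throw new Error(EXHAUSTED_MESSAGE);
  }
}
```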
- yarn dev - Start development server with hot reload
- yarn build - Build TypeScript to JavaScript
- yarn start - Start production server
- yarn lint - Run ESLint
- yarn auth:list - List all authenticated accounts
- yarn auth:add <id> - Add a new account
- yarn auth:remove <id> - Remove an account
- yarn auth:counts - Check request counts for all accounts
src/
├── auth/ # Google authentication logic
├── gemini/ # Gemini API client and mapping
├── routes/ # Express route handlers
├── types/ # TypeScript type definitions
└── utils/ # Utility functions