ivanarifin/gemini-cli-proxy

Gemini CodeAssist Proxy

This local server provides OpenAI (/openai) and Anthropic (/anthropic) compatible endpoints through Gemini CodeAssist (Gemini CLI).

  • If you have used Gemini CLI before, the proxy reuses your existing Gemini CLI credentials.
  • If you have NOT used Gemini CLI before, you will be prompted to log in to the Gemini CLI app through your browser.

But why?

Gemini CodeAssist (Gemini CLI) offers a generous free tier. As of 2025-09-01, the free tier allows 60 requests/min and 1,000 requests/day.

However, Gemini CodeAssist does not provide direct access to Gemini models, which limits your choice of tools to CodeAssist plugins. This proxy exposes it through OpenAI and Anthropic compatible endpoints instead.

Quick Start

yarn dev

The server will start on http://localhost:3000

  • OpenAI compatible endpoint: http://localhost:3000/openai
  • Anthropic compatible endpoint: http://localhost:3000/anthropic

Usage

yarn dev [options]

Options:

  • -p, --port <port> - Server port (default: 3000)
  • -g, --google-cloud-project <project> - Google Cloud project ID if you have paid/enterprise tier (default: GOOGLE_CLOUD_PROJECT env variable)
  • --disable-browser-auth - Disables browser auth flow and uses code based auth (default: false)
  • --enable-google-search - Enables native Google Search tool (default: false)
  • --disable-auto-model-switch - Disables auto model switching in case of rate limiting (default: false)
  • --oauth-rotation-paths <paths> - Comma-separated paths to OAuth credential files for automatic rotation on rate limits (default: disabled)
  • --oauth-rotation-folder <folder> - Path to folder containing OAuth credential files for automatic rotation (default: disabled)

If you have NOT used Gemini CLI before, you will be prompted to log in to the Gemini CLI app through your browser. Credentials are saved to ~/.gemini/oauth_creds.json, the same file Gemini CLI uses.

Supported Models

The following Gemini models are supported:

  • auto - Enables automatic model switching (starts with gemini-3-pro-preview, downgrades on rate limits)
  • gemini-2.5-pro - Previous generation Pro model
  • gemini-2.5-flash - Faster, lighter model
  • gemini-3-pro-preview - Latest Gemini 3 Pro model (preview, default for "auto")
  • gemini-3-flash-preview - Latest Gemini 3 Flash model (preview)

gemini-3-pro-preview is the default model when you request "auto" or when no model is specified.

Intelligent Model Passthrough

When you request a specific model in your API call (e.g., gemini-3-pro-preview), the proxy uses that exact model and does not apply the automatic downgrade/fallback logic.

Auto-switching (Pro → Flash) only occurs when both of the following hold:

  • The requested model is "auto", null, or missing from the request
  • The model hits rate limits and a fallback is available

This ensures that when you explicitly request a model, you get that model.

Example:

# Request auto-switching (recommended for most use cases)
curl http://localhost:3000/openai/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]}'

# Request specific model (bypasses auto-switching)
curl http://localhost:3000/openai/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-3-flash-preview", "messages": [{"role": "user", "content": "Hello!"}]}'
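
The passthrough rule above can be sketched as a small decision function. This is a minimal illustration, not the proxy's actual source: planModel, nextModelOnRateLimit, and the FALLBACKS table are hypothetical names, and the choice of gemini-3-flash-preview as the downgrade target is an assumption (the text only says Pro → Flash).

```typescript
// Hypothetical sketch; names and the fallback table are assumptions.
const DEFAULT_MODEL = "gemini-3-pro-preview";

// Assumed downgrade path used only in auto mode (Pro -> Flash).
const FALLBACKS: Record<string, string> = {
  "gemini-3-pro-preview": "gemini-3-flash-preview",
};

interface ModelPlan {
  model: string;          // model to call first
  allowFallback: boolean; // whether a rate limit may trigger a downgrade
}

// "auto", null, or a missing model enables auto-switching;
// any explicit model name is passed through unchanged.
function planModel(requested?: string | null): ModelPlan {
  if (requested == null || requested === "auto") {
    return { model: DEFAULT_MODEL, allowFallback: true };
  }
  return { model: requested, allowFallback: false };
}

// Returns the model to downgrade to after a rate limit, or null if
// the plan forbids fallback or no fallback is available.
function nextModelOnRateLimit(plan: ModelPlan): string | null {
  if (!plan.allowFallback) return null;
  return FALLBACKS[plan.model] ?? null;
}
```

With this shape, an explicit request for gemini-3-pro-preview and an "auto" request start on the same model, but only the "auto" request is allowed to downgrade.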

Use with -insert-your-favorite-agentic-tool-here-

Most agentic tools read their configuration from environment variables. Export the following:

export OPENAI_API_BASE=http://localhost:3000/openai
export OPENAI_API_KEY=ItDoesNotMatter
export ANTHROPIC_BASE_URL="http://localhost:3000/anthropic"
export ANTHROPIC_AUTH_TOKEN=ItDoesNotMatter
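
If you are writing your own client rather than using an existing tool, the same variables can drive the request. The sketch below only builds the request; buildChatRequest and ChatRequest are hypothetical names, and the /chat/completions path is taken from the curl examples in this README. The environment is passed in as a plain object so the code does not depend on any particular runtime.

```typescript
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds an OpenAI-style chat completion request for the proxy.
// Falls back to the README's default address when the variables are unset.
function buildChatRequest(
  env: Record<string, string | undefined>,
  prompt: string,
  model: string = "auto",
): ChatRequest {
  const rawBase = env["OPENAI_API_BASE"] ?? "http://localhost:3000/openai";
  const base = rawBase.replace(/\/+$/, ""); // drop trailing slashes
  const key = env["OPENAI_API_KEY"] ?? "ItDoesNotMatter";
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${key}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In Node, for example, the result could be passed to fetch as fetch(request.url, request.init) after calling buildChatRequest(process.env, "Hello!").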

Use with Claude Code

Add the following env fields to your .claude/settings.json file:

{
  "permissions": {
    ...
  },
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:3000/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "NotImportant",
    "ANTHROPIC_MODEL": "gemini-2.5-pro"
  }
}

Use with Zed

Add the following to the Zed config file

{
  "language_models": {
    "openai": {
      "api_url": "http://localhost:3000/openai",
      "available_models": [
        {
          "name": "gemini-2.5-pro",
          "display_name": "localhost:gemini-2.5-pro",
          "max_tokens": 131072
        }
      ]
    }
  }
}

OAuth Token Rotation

To enable automatic OAuth token rotation when rate limits (HTTP 429) are encountered:

Method 1: Using Auth CLI (Recommended)

The easiest way to manage multiple accounts is using the built-in auth CLI:

# Add a new account with a specific ID
yarn auth:add account1

# List all authenticated accounts and their status
yarn auth:list

# Check request counts for all accounts
yarn auth:counts

# Remove an account
yarn auth:remove account1

All accounts added via yarn auth:add are automatically stored in ~/.gemini/accounts/ and will be used for rotation if the server is started with --oauth-rotation-folder ~/.gemini/accounts.

Method 2: Using Individual File Paths

  1. Create multiple OAuth credential files by authenticating with different Google accounts
  2. After each login, copy the resulting credential file (e.g., ~/.gemini/oauth_creds.json) to its own location
  3. Start the server with the --oauth-rotation-paths option:
gemini-cli-proxy --oauth-rotation-paths "/path/to/acc1.json,/path/to/acc2.json,/path/to/acc3.json"

Method 3: Using a Folder

  1. Create a folder and save all OAuth credential files in it
  2. Start the server with --oauth-rotation-folder option:
# Start server with folder-based rotation
# ($HOME is used because the shell does not expand ~ inside double quotes)
gemini-cli-proxy --oauth-rotation-folder "$HOME/.gemini/accounts"

Benefits of folder-based rotation:

  • No need to restart when adding new accounts - just add JSON files to the folder
  • All accounts are automatically discovered and rotated
  • Easier to manage multiple accounts
  • No need to specify individual file paths

When a 429 error is detected:

  1. The proxy automatically rotates to the next account in the list (round-robin)
  2. The new credentials are copied to ~/.gemini/oauth_creds.json
  3. The failed request is automatically retried once with the new account
  4. A log message indicates which account is now active: [ROTATOR] Rate limit hit. Switched to account: <filename>

Note: OAuth rotation requires at least 2 credential files (or 2+ JSON files in a folder) to be effective.
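
The round-robin step can be sketched as a pure decision function. This is an illustration under stated assumptions, not the proxy's implementation: decideRotation and RotationDecision are hypothetical names, and copying credentials to ~/.gemini/oauth_creds.json and resending the request are deliberately left out.

```typescript
interface RotationDecision {
  activeIndex: number;    // account to use from now on
  retry: boolean;         // whether the failed request should be resent once
  logLine: string | null; // message to log, if a switch happened
}

// Decides what to do after a response with the given HTTP status.
function decideRotation(
  accounts: readonly string[],
  activeIndex: number,
  status: number,
): RotationDecision {
  // Rotation only helps with at least two credential files.
  if (status !== 429 || accounts.length < 2) {
    return { activeIndex, retry: false, logLine: null };
  }
  // Round-robin: move to the next account, wrapping at the end of the list.
  const next = (activeIndex + 1) % accounts.length;
  return {
    activeIndex: next,
    retry: true,
    logLine: `[ROTATOR] Rate limit hit. Switched to account: ${accounts[next]}`,
  };
}
```

Keeping the decision separate from the I/O makes the wrap-around and the single-account case easy to check in isolation.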

Account Exhaustion Handling:

  • When all OAuth accounts have been exhausted, the rotator throws an error
  • Error message: "All OAuth accounts have been exhausted. Please add new OAuth credential files to the rotation folder or restart the server to reset the exhaustion state."
  • To reset the exhaustion state, add new OAuth credential files to the folder or restart the server
  • After a reset, the rotator resumes cycling through all accounts
  • This prevents infinite rotation loops and ensures you are notified when accounts are exhausted
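
One way to model this behaviour is to track which accounts have already hit a rate limit. The sketch below is an assumption about how such a rotator could look, not the proxy's code: AccountRotator and its methods are hypothetical, and it assumes an account counts as exhausted once it has been rotated away from.

```typescript
const EXHAUSTED_MESSAGE =
  "All OAuth accounts have been exhausted. Please add new OAuth credential " +
  "files to the rotation folder or restart the server to reset the exhaustion state.";

class AccountRotator {
  private accounts: string[];
  private exhausted: Set<string> = new Set<string>();
  private index = 0;

  constructor(accounts: string[]) {
    this.accounts = accounts.slice(); // copy so the caller's array is not mutated
  }

  get active(): string {
    return this.accounts[this.index];
  }

  // A newly discovered credential file resets the exhaustion state.
  addAccount(name: string): void {
    if (this.accounts.indexOf(name) === -1) {
      this.accounts.push(name);
      this.exhausted.clear();
    }
  }

  // Marks the active account as exhausted and switches to the next usable
  // one; throws instead of looping forever when none is left.
  rotate(): string {
    this.exhausted.add(this.active);
    for (let step = 1; step <= this.accounts.length; step++) {
      const candidate = (this.index + step) % this.accounts.length;
      if (!this.exhausted.has(this.accounts[candidate])) {
        this.index = candidate;
        return this.active;
      }
    }
    throw new Error(EXHAUSTED_MESSAGE);
  }
}
```

The bounded loop is what prevents an infinite rotation: each call inspects every account at most once before giving up.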

Development

Scripts

  • yarn dev - Start development server with hot reload
  • yarn build - Build TypeScript to JavaScript
  • yarn start - Start production server
  • yarn lint - Run ESLint
  • yarn auth:list - List all authenticated accounts
  • yarn auth:add <id> - Add a new account
  • yarn auth:remove <id> - Remove an account
  • yarn auth:counts - Check request counts for all accounts

Project Structure

src/
├── auth/           # Google authentication logic
├── gemini/         # Gemini API client and mapping
├── routes/         # Express route handlers
├── types/          # TypeScript type definitions
└── utils/          # Utility functions
