Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
d5a43fd
feat(providers): ✨ add antigravity provider and auth base
Mirrowel Nov 23, 2025
34cb9f8
feat(providers): ✨ add Gemini 3 thoughtSignature handling and reasoni…
Mirrowel Nov 23, 2025
7c758a6
feat(providers): add Antigravity file logging, reasoning mapping and …
Mirrowel Nov 23, 2025
1495325
feat(providers): ✨ support gemini 2.5/3 reasoning configs and custom …
Mirrowel Nov 23, 2025
ff82739
feat(providers): ✨ add server-side thoughtSignature cache and preserv…
Mirrowel Nov 23, 2025
065d589
fix(providers): πŸ› ensure only first parallel tool call retains though…
Mirrowel Nov 23, 2025
fc70523
feat(providers): ✨ add claude-sonnet-4-5 models and remove unnecessar…
Mirrowel Nov 23, 2025
97f1950
feat(auth): extract GoogleOAuthBase and add antigravity provider
Mirrowel Nov 23, 2025
77bfd5f
feat(antigravity): ✨ add dynamic model discovery toggle and get_valid…
Mirrowel Nov 23, 2025
c6478ed
fix(providers): πŸ› fix antigravity provider compatibility and async cr…
Mirrowel Nov 23, 2025
264959a
fix(antigravity): πŸ› convert tool parameters to parametersJsonSchema a…
Mirrowel Nov 23, 2025
4ff1edf
fix(providers): πŸ› normalize JSON Schema types, clean Claude tool sche…
Mirrowel Nov 23, 2025
0970b56
fix(antigravity): πŸ› add function call id fields and restrict thoughtS…
Mirrowel Nov 24, 2025
6adac7a
fix(api): πŸ› override global temperature=0 via OVERRIDE_TEMPERATURE_ZERO
Mirrowel Nov 24, 2025
d7fa998
feat(antigravity): ✨ add Gemini 3 tool-fix (namespace, signature, sys…
Mirrowel Nov 24, 2025
946e5a0
feat(antigravity): ✨ add disk persistence for thoughtSignature cache
Mirrowel Nov 24, 2025
08736cc
feat(antigravity): ✨ add Claude support and parse double-encoded JSON…
Mirrowel Nov 27, 2025
78eef96
feat(antigravity): ✨ add Claude thinking caching and generalize Antig…
Mirrowel Nov 27, 2025
0ff233d
refactor(gemini): πŸ”¨ implement official Gemini CLI discovery flow with…
Mirrowel Nov 27, 2025
afe6e70
refactor(antigravity): πŸ”¨ restructure provider with comprehensive code…
Mirrowel Nov 27, 2025
9bc26b9
refactor(providers): πŸ”¨ extract cache logic into shared ProviderCache …
Mirrowel Nov 27, 2025
e6a4ff2
refactor(antigravity): πŸ”¨ simplify Claude model variant handling with …
Mirrowel Nov 27, 2025
ae56762
feat(gemini): ✨ implement Gemini 3 support with tool fixes and signat…
Mirrowel Nov 27, 2025
3298177
refactor(gemini): πŸ”¨ remove redundant model and project fields from re…
Mirrowel Nov 27, 2025
868b7c9
refactor(logging): πŸ”¨ adjust logging levels and improve schema cleanin…
Mirrowel Nov 27, 2025
74f9532
feat(antigravity): ✨ add thinking mode toggling for mid-conversation …
Mirrowel Nov 27, 2025
0ea3b2d
fix(proxy): πŸ› prevent role field concatenation in streaming responses
Mirrowel Nov 27, 2025
4d4a198
fix(antigravity): πŸ› handle malformed double-encoded JSON responses
Mirrowel Nov 27, 2025
8d69bcd
fix(client): πŸ› prevent provider initialization without configured cre…
Mirrowel Nov 27, 2025
8a839ed
refactor(antigravity): πŸ”¨ remove thinking mode toggling feature
Mirrowel Nov 27, 2025
b5da45c
feat(client): ✨ add credential prioritization system for tier-based m…
Mirrowel Nov 27, 2025
f35e0e7
feat(rotation): ✨ add configurable weighted random credential selection
Mirrowel Nov 27, 2025
f5ccdf6
docs: πŸ“š add comprehensive documentation for new features and providers
Mirrowel Nov 27, 2025
7830a78
refactor(credential-tool): πŸ”¨ add export submenu for credential manage…
Mirrowel Nov 27, 2025
62e7cf3
One huge ass bugfix i can't even list here. It's a mess i'll fix later
Mirrowel Nov 27, 2025
d4593e5
fix(gemini): πŸ› consolidate parallel tool responses and improve rate l…
Mirrowel Nov 27, 2025
087aab7
feat(antigravity): ✨ add thinking mode sanitization for Claude API co…
Mirrowel Nov 27, 2025
474826e
chore(antigravity): 🧹 update User-Agent header to version 1.11.9
Mirrowel Nov 27, 2025
6c4ca7c
feat(antigravity): ✨ add default safety settings to prevent content f…
Mirrowel Nov 27, 2025
5bc49f2
feat(auth): ✨ add environment variable-based OAuth credential support…
Mirrowel Nov 27, 2025
d94742e
fix(auth): πŸ› add exponential backoff and validation for token refresh…
Mirrowel Nov 27, 2025
f6dce02
fix(providers): πŸ› improve finish_reason handling and tool_calls initi…
Mirrowel Nov 27, 2025
2384d86
fix(proxy): πŸ› load environment variables before displaying PROXY_API_KEY
Mirrowel Nov 27, 2025
64859d9
feat(settings): ✨ add provider-specific settings management UI
Mirrowel Nov 27, 2025
0dbcf50
chore(build): 🧹 remove Windows launcher script (not supposed to be th…
Mirrowel Nov 27, 2025
efbd008
docs(readme): πŸ“š improve Antigravity provider feature documentation
Mirrowel Nov 27, 2025
6573de3
chore(config): 🧹 ignore environment files and increase default token …
Mirrowel Nov 27, 2025
bd8f638
feat(credentials): ✨ add support for environment-based credential loa…
Mirrowel Nov 27, 2025
b6a47c9
feat(api): ✨ add model pricing and capabilities enrichment service
Mirrowel Nov 27, 2025
6ed1677
fix(provider): πŸ› improve Gemini 3 tool schema handling and parameter …
Mirrowel Nov 27, 2025
f50cbff
feat(provider): ✨ add strict JSON schema enforcement for Gemini 3 too…
Mirrowel Nov 27, 2025
5a03c26
fix(provider): πŸ› expand JSON schema validation keyword filtering and …
Mirrowel Nov 27, 2025
eb3864b
debugging pass to try to unfuck deployment
Mirrowel Nov 27, 2025
a140a0d
refactor(logging): πŸ”¨ remove debug print statements and add concise de…
Mirrowel Nov 27, 2025
29df294
fix(provider): πŸ› skip file operations for env:// credential paths
Mirrowel Nov 27, 2025
c264be0
refactor(api): πŸ”¨ change is_ready from method to property access
Mirrowel Nov 28, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,6 @@ coverage.xml
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
Expand Down Expand Up @@ -124,4 +123,8 @@ test_proxy.py
start_proxy.bat
key_usage.json
staged_changes.txt
launcher_config.json
cache/antigravity/thought_signatures.json
logs/
cache/
*.env
299 changes: 298 additions & 1 deletion DOCUMENTATION.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ client = RotatingClient(
- `whitelist_models` (`Optional[Dict[str, List[str]]]`, default: `None`): Whitelist of models to always include, overriding `ignore_models`.
- `enable_request_logging` (`bool`, default: `False`): If `True`, enables detailed per-request file logging.
- `max_concurrent_requests_per_key` (`Optional[Dict[str, int]]`, default: `None`): Max concurrent requests allowed for a single API key per provider.
- `rotation_tolerance` (`float`, default: `3.0`): Controls the credential rotation strategy. See Section 2.2 for details.

#### Core Responsibilities

Expand Down Expand Up @@ -110,8 +111,16 @@ The `acquire_key` method uses a sophisticated strategy to balance load:
2. **Tiering**: Valid keys are split into two tiers:
* **Tier 1 (Ideal)**: Keys that are completely idle (0 concurrent requests).
* **Tier 2 (Acceptable)**: Keys that are busy but still under their configured `MAX_CONCURRENT_REQUESTS_PER_KEY_<PROVIDER>` limit for the requested model. This allows a single key to be used multiple times for the same model, maximizing throughput.
3. **Prioritization**: Within each tier, keys with the **lowest daily usage** are prioritized to spread costs evenly.
3. **Selection Strategy** (configurable via `rotation_tolerance`):
* **Deterministic (tolerance=0.0)**: Within each tier, keys are sorted by daily usage count and the least-used key is always selected. This provides perfect load balance but predictable patterns.
* **Weighted Random (tolerance>0, default)**: Keys are selected randomly with weights biased toward less-used ones:
- Formula: `weight = (max_usage - credential_usage) + tolerance + 1`
- `tolerance=2.0` (recommended): Balanced randomness - credentials within 2 uses of the maximum can still be selected with reasonable probability
- `tolerance=5.0+`: High randomness - even heavily-used credentials have significant probability
- **Security Benefit**: Unpredictable selection patterns make rate limit detection and fingerprinting harder
- **Load Balance**: Lower-usage credentials still preferred, maintaining reasonable distribution
4. **Concurrency Limits**: Checks against `max_concurrent` limits to prevent overloading a single key.
5. **Priority Groups**: When credential prioritization is enabled, higher-tier credentials (lower priority numbers) are tried first before moving to lower tiers.

#### Failure Handling & Cooldowns

Expand Down Expand Up @@ -313,6 +322,294 @@ The `CooldownManager` handles IP or account-level rate limiting that affects all
- If so, `CooldownManager.start_cooldown()` is called for the entire provider
- All subsequent `acquire_key()` calls for that provider will wait until the cooldown expires


### 2.10. Credential Prioritization System (`client.py` & `usage_manager.py`)

The library now includes an intelligent credential prioritization system that automatically detects credential tiers and ensures optimal credential selection for each request.

**Key Concepts:**

- **Provider-Level Priorities**: Providers can implement `get_credential_priority()` to return a priority level (1=highest, 10=lowest) for each credential
- **Model-Level Requirements**: Providers can implement `get_model_tier_requirement()` to specify minimum priority required for specific models
- **Automatic Filtering**: The client automatically filters out incompatible credentials before making requests
- **Priority-Aware Selection**: The `UsageManager` prioritizes higher-tier credentials (lower numbers) within the same priority group

**Implementation Example (Gemini CLI):**

```python
def get_credential_priority(self, credential: str) -> Optional[int]:
"""Returns priority based on Gemini tier."""
tier = self.project_tier_cache.get(credential)
if not tier:
return None # Not yet discovered

# Paid tiers get highest priority
if tier not in ['free-tier', 'legacy-tier', 'unknown']:
return 1

# Free tier gets lower priority
if tier == 'free-tier':
return 2

return 10

def get_model_tier_requirement(self, model: str) -> Optional[int]:
"""Returns minimum priority required for model."""
if model.startswith("gemini-3-"):
return 1 # Only paid tier (priority 1) credentials

return None # All other models have no restrictions
```

**Usage Manager Integration:**

The `acquire_key()` method has been enhanced to:
1. Group credentials by priority level
2. Try highest priority group first (priority 1, then 2, etc.)
3. Within each group, use existing tier1/tier2 logic (idle keys first, then busy keys)
4. Load balance within priority groups by usage count
5. Only move to next priority if all higher-priority credentials are exhausted

**Benefits:**

- Ensures paid-tier credentials are always used for premium models
- Prevents failed requests due to tier restrictions
- Optimal cost distribution (free tier used when possible, paid when required)
- Graceful fallback if primary credentials are unavailable

---

### 2.11. Provider Cache System (`providers/provider_cache.py`)

A modular, shared caching system for providers to persist conversation state across requests.

**Architecture:**

- **Dual-TTL Design**: Short-lived memory cache (default: 1 hour) + longer-lived disk persistence (default: 24 hours)
- **Background Persistence**: Batched disk writes every 60 seconds (configurable)
- **Automatic Cleanup**: Background task removes expired entries from memory cache

### 3.5. Antigravity (`antigravity_provider.py`)

The most sophisticated provider implementation, supporting Google's internal Antigravity API for Gemini and Claude models.

#### Architecture

- **Unified Streaming/Non-Streaming**: Single code path handles both response types with optimal transformations
- **Thought Signature Caching**: Server-side caching of encrypted signatures for multi-turn Gemini 3 conversations
- **Model-Specific Logic**: Automatic configuration based on model type (Gemini 2.5, Gemini 3, Claude)

#### Model Support

**Gemini 2.5 (Pro/Flash):**
- Uses `thinkingBudget` parameter (integer tokens: -1 for auto, 0 to disable, or specific value)
- Standard safety settings and toolConfig
- Stream processing with thinking content separation

**Gemini 3 (Pro/Image):**
- Uses `thinkingLevel` parameter (string: "low" or "high")
- **Tool Hallucination Prevention**:
- Automatic system instruction injection explaining custom tool schema rules
- Parameter signature injection into tool descriptions (e.g., "STRICT PARAMETERS: files (ARRAY_OF_OBJECTS[path: string REQUIRED, ...])")
- Namespace prefix for tool names (`gemini3_` prefix) to avoid training data conflicts
- Malformed JSON auto-correction (handles extra trailing braces)
- **ThoughtSignature Management**:
- Caching signatures from responses for reuse in follow-up messages
- Automatic injection into functionCalls for multi-turn conversations
- Fallback to bypass value if signature unavailable

**Claude Sonnet 4.5:**
- Proxied through Antigravity API (uses internal model name `claude-sonnet-4-5-thinking`)
- Uses `thinkingBudget` parameter like Gemini 2.5
- **Thinking Preservation**: Caches thinking content using composite keys (tool_call_id + text_hash)
- **Schema Cleaning**: Removes unsupported properties (`$schema`, `additionalProperties`, `const` β†’ `enum`)

#### Base URL Fallback

Automatic fallback chain for resilience:
1. `daily-cloudcode-pa.sandbox.googleapis.com` (primary sandbox)
2. `autopush-cloudcode-pa.sandbox.googleapis.com` (fallback sandbox)
3. `cloudcode-pa.googleapis.com` (production fallback)

#### Message Transformation

**OpenAI β†’ Gemini Format:**
- System messages β†’ `systemInstruction` with parts array
- Multi-part content (text + images) β†’ `inlineData` format
- Tool calls β†’ `functionCall` with args and id
- Tool responses β†’ `functionResponse` with name and response
- ThoughtSignatures preserved/injected as needed

**Tool Response Grouping:**
- Converts linear format (call, response, call, response) to grouped format
- Groups all function calls in one `model` message
- Groups all responses in one `user` message
- Required for Antigravity API compatibility

#### Configuration (Environment Variables)

```env
# Cache control
ANTIGRAVITY_SIGNATURE_CACHE_TTL=3600 # Memory cache TTL
ANTIGRAVITY_SIGNATURE_DISK_TTL=86400 # Disk cache TTL
ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true

# Feature flags
ANTIGRAVITY_PRESERVE_THOUGHT_SIGNATURES=true # Include signatures in client responses
ANTIGRAVITY_ENABLE_DYNAMIC_MODELS=false # Use API model discovery
ANTIGRAVITY_GEMINI3_TOOL_FIX=true # Enable Gemini 3 hallucination prevention
ANTIGRAVITY_CLAUDE_THINKING_SANITIZATION=true # Enable Claude thinking mode auto-correction

# Gemini 3 tool fix customization
ANTIGRAVITY_GEMINI3_TOOL_PREFIX="gemini3_" # Namespace prefix
ANTIGRAVITY_GEMINI3_DESCRIPTION_PROMPT="\n\nSTRICT PARAMETERS: {params}."
ANTIGRAVITY_GEMINI3_SYSTEM_INSTRUCTION="..." # Full system prompt
```

#### Claude Extended Thinking Sanitization

The provider includes automatic sanitization for Claude's extended thinking mode, handling common error scenarios:

**Problem**: Claude's extended thinking API requires strict consistency in thinking blocks:
- If thinking is enabled, the final assistant turn must start with a thinking block
- If thinking is disabled, no thinking blocks can be present in the final turn
- Tool use loops are part of a single "assistant turn"
- You **cannot** toggle thinking mode mid-turn (this is invalid per Claude API)

**Scenarios Handled**:

| Scenario | Action |
|----------|--------|
| Tool loop WITH thinking + thinking enabled | Preserve thinking, continue normally |
| Tool loop WITHOUT thinking + thinking enabled | **Inject synthetic closure** to start fresh turn with thinking |
| Thinking disabled | Strip all thinking blocks |
| Normal conversation (no tool loop) | Strip old thinking, new response adds thinking naturally |

**Solution**: The `_sanitize_thinking_for_claude()` method:
- Analyzes conversation state to detect incomplete tool use loops
- When enabling thinking in a tool loop that started without thinking:
- Injects a minimal synthetic assistant message: `"[Tool execution completed. Processing results.]"`
- This **closes** the previous turn, allowing Claude to start a **fresh turn with thinking**
- Strips thinking from old turns (Claude API ignores them anyway)
- Preserves thinking when the turn was started with thinking enabled

**Key Insight**: Instead of force-disabling thinking, we close the tool loop with a synthetic message. This allows seamless model switching (e.g., Gemini β†’ Claude with thinking) without losing the ability to think.

**Example**:
```
Before sanitization:
User: "What's the weather?"
Assistant: [tool_use: get_weather] ← Made by Gemini (no thinking)
User: [tool_result: "20C sunny"]

After sanitization (thinking enabled):
User: "What's the weather?"
Assistant: [tool_use: get_weather]
User: [tool_result: "20C sunny"]
Assistant: "[Tool execution completed. Processing results.]" ← INJECTED

β†’ Claude now starts a NEW turn and CAN think!
```

**Configuration**:
```env
ANTIGRAVITY_CLAUDE_THINKING_SANITIZATION=true # Enable/disable auto-correction
```

#### File Logging

Optional transaction logging for debugging:
- Enabled via `enable_request_logging` parameter
- Creates `logs/antigravity_logs/TIMESTAMP_MODEL_UUID/` directory per request
- Logs: `request_payload.json`, `response_stream.log`, `final_response.json`, `error.log`

---


- **Atomic Disk Writes**: Uses temp-file-and-move pattern to prevent corruption

**Key Methods:**

1. **`store(key, value)`**: Synchronously queues value for storage (schedules async write)
2. **`retrieve(key)`**: Synchronously retrieves from memory, optionally schedules disk fallback
3. **`store_async(key, value)`**: Awaitable storage for guaranteed persistence
4. **`retrieve_async(key)`**: Awaitable retrieval with disk fallback

**Use Cases:**

- **Gemini 3 ThoughtSignatures**: Caching tool call signatures for multi-turn conversations
- **Claude Thinking**: Preserving thinking content for consistency across conversation turns
- **Any Transient State**: Generic key-value storage for provider-specific needs

**Configuration (Environment Variables):**

```env
# Cache control (prefix can be customized per cache instance)
PROVIDER_CACHE_ENABLE=true
PROVIDER_CACHE_WRITE_INTERVAL=60 # seconds between disk writes
PROVIDER_CACHE_CLEANUP_INTERVAL=1800 # 30 min between cleanups

# Gemini 3 specific
GEMINI_CLI_SIGNATURE_CACHE_ENABLE=true
GEMINI_CLI_SIGNATURE_CACHE_TTL=3600 # 1 hour memory TTL
GEMINI_CLI_SIGNATURE_DISK_TTL=86400 # 24 hours disk TTL
```

**File Structure:**

```
cache/
β”œβ”€β”€ gemini_cli/
β”‚ └── gemini3_signatures.json
└── antigravity/
β”œβ”€β”€ gemini3_signatures.json
└── claude_thinking.json
```

---

### 2.12. Google OAuth Base (`providers/google_oauth_base.py`)

A refactored, reusable OAuth2 base class that eliminates code duplication across Google-based providers.

**Refactoring Benefits:**

- **Single Source of Truth**: All OAuth logic centralized in one class
- **Easy Provider Addition**: New providers only need to override constants
- **Consistent Behavior**: Token refresh, expiry handling, and validation work identically across providers
- **Maintainability**: OAuth bugs fixed once apply to all inheriting providers

**Provider Implementation:**

```python
class AntigravityAuthBase(GoogleOAuthBase):
# Required overrides
CLIENT_ID = "antigravity-client-id"
CLIENT_SECRET = "antigravity-secret"
OAUTH_SCOPES = [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/cclog", # Antigravity-specific
"https://www.googleapis.com/auth/experimentsandconfigs",
]
ENV_PREFIX = "ANTIGRAVITY" # Used for env var loading

# Optional overrides (defaults provided)
CALLBACK_PORT = 51121
CALLBACK_PATH = "/oauthcallback"
```

**Inherited Features:**

- Automatic token refresh with exponential backoff
- Invalid grant re-authentication flow
- Stateless deployment support (env var loading)
- Atomic credential file writes
- Headless environment detection
- Sequential refresh queue processing

---


---

## 3. Provider Specific Implementations
Expand Down
31 changes: 31 additions & 0 deletions Deployment guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,37 @@ If you are using providers that require complex OAuth files (like **Gemini CLI**
4. Copy the contents of this file and paste them directly into your `.env` file or Render's "Environment Variables" section.
5. The proxy will automatically detect and use these variablesβ€”no file upload required!


### Advanced: Antigravity OAuth Provider

The Antigravity provider requires OAuth2 authentication similar to Gemini CLI. It provides access to:
- Gemini 2.5 models (Pro/Flash)
- Gemini 3 models (Pro/Image-preview) - **requires paid-tier Google Cloud project**
- Claude Sonnet 4.5 via Google's Antigravity proxy

**Setting up Antigravity locally:**
1. Run the credential tool: `python -m rotator_library.credential_tool`
2. Select "Add OAuth Credential" and choose "Antigravity"
3. Complete the OAuth flow in your browser
4. The credential is saved to `oauth_creds/antigravity_oauth_1.json`

**Exporting for stateless deployment:**
1. Run: `python -m rotator_library.credential_tool`
2. Select "Export Antigravity to .env"
3. Copy the generated environment variables to your deployment platform:
```env
ANTIGRAVITY_ACCESS_TOKEN="..."
ANTIGRAVITY_REFRESH_TOKEN="..."
ANTIGRAVITY_EXPIRY_DATE="..."
ANTIGRAVITY_EMAIL="your-email@gmail.com"
```

**Important Notes:**
- Antigravity uses Google OAuth with additional scopes for cloud platform access
- Gemini 3 models require a paid-tier Google Cloud project (free tier will fail)
- The provider automatically handles thought signature caching for multi-turn conversations
- Tool hallucination prevention is enabled by default for Gemini 3 models

4. Save the file. (We'll upload it to Render in Step 5.)


Expand Down
Loading
Loading