
LLM Interactive Proxy


A Swiss Army knife proxy for LLM-powered applications. It sits between any LLM-aware client and any backend, presenting multiple front-end APIs (OpenAI, Anthropic, Gemini) while routing to the provider of your choice. Translate requests, override models, rotate API keys, prevent leaks, inspect traffic, and execute chat-embedded commands, all from a single drop-in gateway.

Architecture

graph TD
    subgraph "Clients"
        A[OpenAI Client]
        B[Anthropic Client]
        C[Gemini Client]
        D[Any LLM App]
    end

    subgraph "LLM Interactive Proxy"
        FE["Front-end APIs<br/>(OpenAI, Anthropic, Gemini)"]
        Core["Core Proxy Logic<br/>(Routing, Translation, Safety)"]
        BE["Back-end Connectors<br/>(OpenAI, Anthropic, Gemini, etc.)"]
        FE --> Core --> BE
    end

    subgraph "Providers"
        P1[OpenAI API]
        P2[Anthropic API]
        P3[Google Gemini API]
        P4[OpenRouter API]
    end

    A --> FE
    B --> FE
    C --> FE
    D --> FE
    BE --> P1
    BE --> P2
    BE --> P3
    BE --> P4

Key Features

  • Connect Any App to Any Model: Route requests from any LLM client to any backend, even across protocols
  • Codebuff WebSocket Server: Real-time AI communication via WebSocket with session management, streaming responses, and file context support - Quick Start
  • Model Override: Force applications to use your chosen model, regardless of hardcoded defaults
  • API Key Rotation: Aggregate and auto-rotate API keys to maximize free-tier usage
  • Test Execution Reminder: Automatically reminds agents to run tests before completing tasks (14+ languages)
  • LLM Assessment: Detect conversation loops and stuck patterns with intelligent monitoring
  • Tool Access Control: Fine-grained control over which tools LLMs can access
  • Dangerous Command Protection: Block destructive git operations before they cause damage
  • File Access Sandboxing: Restrict file operations to safe directories
  • Wire Capture & Debugging: Inspect and analyze all traffic for debugging
  • Random Model Replacement: Probabilistically swap models for session resilience and diversity - Feature Guide
  • Edit Precision Tuning: Auto-adjust parameters when models struggle with precise edits
  • Angel Verification: Real-time response verification with automatic correction
  • And 10+ more features - See User Guide for complete list

Quick Start

Installation

# Clone the repository
git clone https://github.com/matdev83/llm-interactive-proxy.git
cd llm-interactive-proxy

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install -e ".[dev]"

Basic Usage

# Start the proxy with OpenAI backend
export OPENAI_API_KEY="your-key-here"
python -m src.core.cli --default-backend openai:gpt-4o

# Or with custom configuration
python -m src.core.cli --config config/my_config.yaml
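
Once the proxy is running, point any OpenAI-compatible client at it instead of the provider's API. Below is a minimal sketch using the official OpenAI Python SDK, assuming the proxy listens on localhost:8000 (the actual host and port depend on your configuration):

# Point the OpenAI SDK at the proxy instead of api.openai.com.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed proxy address; adjust to your setup
    api_key="unused",  # real provider keys live in the proxy's configuration
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from behind the proxy!"}],
)
print(response.choices[0].message.content)

Because the proxy holds the provider keys, the client-side api_key can be a placeholder unless you have enabled proxy-side authentication.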

For detailed setup instructions, see Quick Start Guide.

Documentation

Supported Front-end Interfaces

The proxy exposes multiple standard API surfaces, allowing you to use your favorite clients with any backend:

  • OpenAI Chat Completions (/v1/chat/completions) - Compatible with OpenAI SDKs and most tools.
  • OpenAI Responses (/v1/responses) - Optimized for structured output generation.
  • OpenAI Models (/v1/models) - Unified model discovery across all backends.
  • Anthropic Messages (/anthropic/v1/messages) - Native support for Claude clients/SDKs.
  • Dedicated Anthropic Server (http://host:8001/v1/messages) - Drop-in replacement for Anthropic API on a separate port (default: 8001).
  • Google Gemini v1beta (/v1beta/models, :generateContent) - Native support for Gemini tools.

See Front-End APIs Overview for more details.
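
For example, a Claude client can talk to any backend through the Anthropic Messages front-end. A minimal sketch with the Anthropic Python SDK, assuming the proxy runs on localhost:8000 and that base_url should point at the /anthropic prefix (the SDK appends /v1/messages itself; see the Front-End APIs Overview for the exact paths):

# Point the Anthropic SDK at the proxy's Anthropic-compatible front-end.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8000/anthropic",  # assumed prefix; adjust to your setup
    api_key="unused",  # provider keys are managed by the proxy
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello via the proxy"}],
)
print(message.content[0].text)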

Supported Backends

  • OpenAI (GPT-4, GPT-4o, o1)
  • Anthropic (Claude 3.5 Sonnet, Opus, Haiku)
  • Google Gemini (API Key, OAuth, GCP, Vertex AI)
  • OpenRouter (Access to 100+ models)
  • ZAI (Zhipu AI / GLM models)
  • Qwen (Alibaba Cloud Qwen models)
  • MiniMax (Hailuo AI reasoning models)
  • ZenMux (Unified model aggregator)
  • Cline (Specialized debugging backend)
  • Hybrid (Virtual backend for two-phase reasoning)

See Backends Overview for full details and configuration.
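
The --default-backend openai:gpt-4o flag above suggests a backend:model naming convention. As a hypothetical sketch (the per-request routing syntax here is an assumption; consult the Backends Overview for the actual mechanism), a client could select a specific backend by prefixing the model name:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

# Hypothetical: route this one request through the OpenRouter backend
# using the same backend:model prefix seen on the CLI.
response = client.chat.completions.create(
    model="openrouter:anthropic/claude-3.5-sonnet",  # assumed routing syntax
    messages=[{"role": "user", "content": "Which backend served this?"}],
)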


License

This project is licensed under the MIT License.

Development

# Run tests
python -m pytest

# Run linter
python -m ruff check --fix .

# Format code
python -m black .

See Development Guide for more details.