This setup provides a unified OpenAI-compatible API gateway with working MCP (Model Context Protocol) tools integration, built on two services: LiteLLM Proxy and MCP Proxy.
```
┌─────────────────┐      ┌──────────────────┐      ┌────────────────┐
│   Client Apps   │──────│  LiteLLM Proxy   │──────│ vLLM           │
│                 │      │   (Port 4000)    │      │ OpenRouter     │
│                 │      └──────────────────┘      └────────────────┘
│                 │               │
│                 │               │ (Integration Planned)
│                 │               ▼
│                 │      ┌──────────────────┐      ┌────────────────┐
│                 │──────│    MCP Proxy     │──────│ Tavily Search  │
└─────────────────┘      │   (Port 4004)    │      │ + Other Tools  │
                         └──────────────────┘      └────────────────┘
```
Copy the example environment file and configure your API keys:

```shell
cp .env.example .env
```

Edit `.env` and set:

- `LITELLM_MASTER_KEY`: your proxy admin key (must start with `sk-`)
- `OPENROUTER_API_KEY`: your OpenRouter API key
- `TAVILY_API_KEY`: your Tavily API key for the web search MCP tools
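For reference, a filled-in `.env` might look like this (the values below are placeholders, not real keys):

```shell
LITELLM_MASTER_KEY=sk-change-me
OPENROUTER_API_KEY=your-openrouter-key
TAVILY_API_KEY=your-tavily-key
```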
```shell
# Start both LiteLLM and MCP Proxy services
docker-compose up -d
```

This will start:

- LiteLLM Proxy on port 4000 (OpenAI-compatible API)
- MCP Proxy on port 4004 (working MCP tools)
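If you are recreating the stack from scratch, the compose file looks roughly like this. This is a sketch only: the image names, volume paths, and command flags are assumptions, so check the actual `docker-compose.yml` in this repo for the authoritative values.

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest    # official LiteLLM image
    ports:
      - "4000:4000"
    env_file: .env
    volumes:
      - ./config.yaml:/app/config.yaml
    command: ["--config", "/app/config.yaml", "--port", "4000"]

  mcp-proxy:
    image: your-mcp-proxy-image                   # placeholder: use the image this repo pins
    ports:
      - "4004:4004"
    env_file: .env
    volumes:
      - ./mcp-proxy-config.json:/app/config.json
```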
```shell
# Test LiteLLM models
chmod +x test_models.sh
./test_models.sh

# Test MCP tools directly (working approach)
chmod +x test_mcp_proxy.sh
./test_mcp_proxy.sh
```

| Model Name | Type | Endpoint | Status |
|---|---|---|---|
| `vllm-Qwen2.5-VL-32B` | Internal | 10.0.0.10:7113 | ✅ Working |
| `openrouter-qwen3-32b` | External | OpenRouter | ✅ Working |
| Tool Name | Description | Status |
|---|---|---|
| `tavily-search` | Web search via Tavily API | ✅ Working |
| `tavily-extract` | Extract content from URLs | ✅ Working |
| `tavily-crawl` | Crawl websites for data | ✅ Working |
| `tavily-map` | Map and structure information | ✅ Working |
✅ Working endpoint: `http://localhost:4004/tavily/mcp`
- LiteLLM Proxy: Full OpenAI-compatible API for LLM models
- MCP Proxy: Direct access to Tavily search tools via HTTP JSON-RPC
- Independent Operation: Both services work independently
- LiteLLM MCP Integration (BETA): Currently has connectivity issues
- Auto-discovery: LiteLLM doesn't auto-discover MCP tools from mcp-proxy yet
Use MCP tools directly through the mcp-proxy endpoints (see examples below).
```shell
# Test a chat completion through the LiteLLM proxy
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -d '{
    "model": "vllm-Qwen2.5-VL-32B",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
```

```shell
# Search for current events using Tavily
curl -X POST http://localhost:4004/tavily/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "tavily-search",
      "arguments": {
        "query": "latest AI news 2025"
      }
    }
  }'
```

```shell
# List all available MCP tools
curl -X POST http://localhost:4004/tavily/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```

Use the mcp-proxy directly for reliable tool access:
- Endpoint: `http://localhost:4004/tavily/mcp`
- Transport: HTTP JSON-RPC
- Auth: none required
- Use LiteLLM for LLM inference
- Use MCP Proxy for tool calls
- Combine results in your application logic
Once LiteLLM's MCP beta issues are resolved, full integration will be available.
- MCP.md - Detailed MCP Proxy setup and configuration
- LiteLLM Docs - Official LiteLLM documentation
```
├── docker-compose.yml       # Both LiteLLM and MCP Proxy services
├── config.yaml              # LiteLLM configuration
├── mcp-proxy-config.json    # MCP Proxy configuration
├── .env                     # Environment variables
├── test_models.sh           # LiteLLM test script
├── test_mcp_proxy.sh        # MCP Proxy test script
├── test_mcp_proxy2.sh       # Additional MCP tests
├── README.md                # This file
└── MCP.md                   # Detailed MCP setup guide
```
For detailed MCP Proxy setup and configuration, see MCP.md.