Use any AI model as a Claude Code subagent — with full tool access.
Quick Start · 中文说明 · npm
Task Relay turns any Anthropic-compatible API endpoint into a full Claude Code subagent. The external model inherits CC's entire tool suite — Read, Write, Edit, Bash, Glob, Grep — and even skills and MCP servers.
How it works: claude -p supports the ANTHROPIC_BASE_URL environment variable. Task Relay wraps this into a simple CLI that manages model configs, session persistence, and context-aware forking.
```
┌─────────────────────────────┐
│ Claude Code (main)          │
│ relay run deepseek "task"   │
└──────────┬──────────────────┘
           │ Bash
           ▼
┌─────────────────────────────┐
│ Task Relay (wrapper)        │
│ reads config → injects env  │
└──────────┬──────────────────┘
           │ subprocess
           ▼
┌─────────────────────────────┐
│ claude -p (external model)  │
│ Full CC tools + skills      │
└─────────────────────────────┘
```
No custom agentic loop. No tool reimplementation. No MCP server. Just a thin config layer over claude -p.
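Conceptually, all the wrapper does is look up the model's endpoint and inject environment variables before spawning `claude -p`. The sketch below is hypothetical, not the real implementation: `relay_run` is an illustrative helper that prints the command it would exec (with a placeholder endpoint) so the wiring is visible; the actual CLI also handles config lookup, sessions, and timeouts.

```shell
# Hypothetical sketch of the env injection — NOT the real CLI.
# relay_run prints the command it would exec; the endpoint value is a
# placeholder that the real tool would read from ~/.relay.yaml.
relay_run() {
  model="$1"; task="$2"
  base_url="https://api.minimaxi.com/anthropic"   # normally looked up per-model in config
  printf 'ANTHROPIC_BASE_URL=%s claude -p "%s" --model %s\n' \
    "$base_url" "$task" "$model"
}

relay_run MiniMax-M2.7 "say hello"
# → ANTHROPIC_BASE_URL=https://api.minimaxi.com/anthropic claude -p "say hello" --model MiniMax-M2.7
```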
- Claude Code installed and authenticated (`claude` command available)
- Node.js >= 18
- An API key from an Anthropic-compatible provider (e.g. MiniMax, DeepSeek)
```shell
# 1. Install
npm install -g @fengjunhui31/task-relay

# 2. Install Skill into Claude Code + generate config template
relay install claude

# 3. Edit ~/.relay.yaml — add your API key

# 4. Test
relay models
relay run minimax-m27 "say hello"
```

Edit ~/.relay.yaml — add your API key:
```yaml
endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: your-api-key-here

models:
  minimax-m27:
    name: MiniMax M2.7
    endpoint: minimax
    model: MiniMax-M2.7
    capabilities: [text, code, agent]
    strengths: [Chinese, creative, general]
    context_window: 256000

preferences:
  coding: [minimax-m27]
  default: [minimax-m27]
```

Done. Claude Code now knows how to delegate tasks to external models.
After relay install claude, a Skill is injected into CC. It intercepts subagent dispatch and routes tasks to external models automatically.
Once the Skill is installed, CC will automatically use relay when it matches a task type to your preferences in ~/.relay.yaml. You don't need to do anything — just use CC normally:
```
You: "Review this file for security issues"
CC:  (internally) preferences.code_review → minimax-m27
     → runs `relay run minimax-m27 "review..." --allowedTools "Read,Glob,Grep"`
     → returns review result
```
You can also explicitly ask CC to use relay:
```
You: "Use relay to refactor src/auth.js with minimax"
CC:  → relay run minimax-m27 "refactor src/auth.js" --allowedTools "Read,Write,Edit,Bash"
```
The Skill includes routing intelligence — CC picks the right tools and options per task type:
| Task Type | Tools | Options |
|---|---|---|
| Code review / research | Read,Glob,Grep | Read-only |
| Bug fix (single file) | Read,Edit | Minimal |
| Coding (multi-file) | Read,Write,Edit,Glob,Grep,Bash | Full |
| Copywriting / docs | — | No tools needed |
| Needs conversation context | — | relay fork |
```shell
relay update claude   # re-install latest Skill from your relay version
```

| Command | Description |
|---|---|
| `relay models` | List models, capabilities, preferences, and available claude -p options |
| `relay run <model> "<task>"` | Relay a task — returns result + session ID |
| `relay fork <model> "<task>"` | Fork CC's current session into an external model (auto-detects session, checks context size) |
| `relay install claude` | Install the Skill into Claude Code |
| `relay update` | Update relay itself |
| `relay update claude` | Update the Claude Code Skill |
Every relay run returns a session ID. CC decides whether to continue:
```shell
relay run minimax-m27 "analyze the codebase"
# → result + [relay] session: abc-123

relay run minimax-m27 "now refactor auth" --session-id abc-123
# → continues with full context from previous call
```

Fork CC's live session into an external model. Auto-detects the active session via process tree, and warns if context exceeds 60% of the target model's window:

```shell
relay fork minimax-m27 "continue this analysis with the full conversation context"
```

Extra arguments are forwarded to claude -p. Recommended options:
```shell
relay run minimax-m27 "review code" --allowedTools "Read,Glob,Grep"
relay run minimax-m27 "write docs" --bare
relay run minimax-m27 "refactor" --worktree
relay run minimax-m27 "explain in Chinese" --system-prompt "用中文回复"
relay run minimax-m27 "safe refactor" --permission-mode plan
```

Relay-specific options:

```shell
relay run minimax-m27 "continue" --session-id abc-123   # resume session
relay run minimax-m27 "big task" --timeout 900000       # custom timeout (ms, default 600s)
```

```yaml
# Layer 1: Endpoints — how to connect
endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: sk-xxx

# Layer 2: Models — what they can do
models:
  minimax-m27:
    endpoint: minimax
    capabilities: [text, code, agent]   # functional tags
    strengths: [Chinese, creative]      # domain strengths
    context_window: 256000

# Layer 3: Preferences — what to use when
preferences:
  coding: [deepseek-v3, minimax-m27]   # priority order
  copywriting: [minimax-m27]
  default: [minimax-m27]
```

| Suitable | Not Suitable |
|---|---|
| Code generation / refactoring | Brainstorming |
| Code review | Requirements clarification |
| Codebase research | Interactive debugging |
| Writing docs / copy | Teaching / explaining |
| Data analysis | Deployment with manual steps |
Rule of thumb: If the task can be fully described in one prompt and the result returned in one shot — relay it.
10 tasks (6 bug fix + 4 code review), comparing CC doing everything vs CC orchestrating + relay executing (MiniMax M2.7).
| Metric | Group A (CC only) | Group B (CC + relay) | Delta |
|---|---|---|---|
| Work tokens (input+output) | 8,977 | 5,381 | -40.1% |
| Total tokens (incl. cache) | 628,247 | 512,546 | -18.4% |
| Cost (CC subscription) | $1.32 | $1.13 | -14.2% |
| Correctness | 10/10 | 10/10 | — |
| Type | Tasks | A correct | B correct | Work token savings |
|---|---|---|---|---|
| Bug fix | 6 | 6/6 | 6/6 | 27% |
| Code review | 4 | 4/4 | 4/4 | 57% |
- Work tokens cut by ~40% — CC's orchestration overhead (dispatch + process result) is much smaller than doing the work itself
- Code review benefits most — output tokens reduced by 57% (CC tends to be verbose; relay is more concise)
- Cache overhead dominates — both groups pay ~48K-96K tokens for system context loading per task, diluting the work token savings in total numbers
- Cost savings at 14% — cache tokens are cheap, so total cost savings lag behind work token savings
- Time tradeoff — relay adds ~1.5-2x latency (external API round-trip), but runs in background so doesn't block the user
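The headline work-token and total-token percentages follow directly from the raw counts in the table above; a quick recomputation (the cost delta rounds slightly differently because it depends on per-token pricing, not shown here):

```shell
# Recompute the percentage deltas from the raw token counts in the table
awk 'BEGIN {
  printf "work tokens:  -%.1f%%\n", (8977 - 5381) / 8977 * 100
  printf "total tokens: -%.1f%%\n", (628247 - 512546) / 628247 * 100
}'
# → work tokens:  -40.1%
# → total tokens: -18.4%
```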
Run `node bench/run.js && node bench/report.js` to reproduce. See `bench/` for details.
Current benchmark uses short single-step tasks where cache overhead dominates. Next iteration will focus on long tasks (multi-file, multi-step) where relay savings compound:
- SWE-bench Verified subset — real GitHub issue fixes requiring repo-wide exploration + multi-file edits
- Multi-step refactoring — extract module → update imports → add tests → update docs
- Codebase-wide review — review 5+ files, cross-reference issues, produce structured report
- Expected delta: work-token savings should exceed 60% on long tasks, as CC's per-step overhead accumulates while relay handles everything in one external session
Any Anthropic-compatible API endpoint works. Non-Anthropic models (GPT-4o, Qwen, etc.) require a proxy like one-api or litellm.
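One approach is to run such a proxy locally and register it as a relay endpoint. The fragment below is a rough, unverified sketch: the port, key, and the assumption that the proxy exposes an Anthropic-compatible `/v1/messages` endpoint all need to be checked against the proxy's own documentation.

```yaml
# Sketch only — verify against your proxy's docs (one-api / litellm)
endpoints:
  proxy:
    base_url: http://localhost:4000   # proxy must translate Anthropic /v1/messages to the target API
    api_key: dummy                    # many local proxies accept any key

models:
  gpt-4o:
    endpoint: proxy
    model: gpt-4o
    capabilities: [text, code]
```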
MIT
Task Relay lets Claude Code call any AI model the way it uses a native subagent.

claude -p supports the ANTHROPIC_BASE_URL environment variable. Task Relay uses this to point the CC runtime at any Anthropic-compatible endpoint, so the external model naturally inherits CC's full tool suite (Read/Write/Edit/Bash, etc.), and even skills and MCP.

No custom agentic loop, no tool library, no MCP server. Just a thin config layer on top of claude -p.

```shell
npm install -g @fengjunhui31/task-relay
relay install claude   # install the Skill into Claude Code
```

Edit ~/.relay.yaml with your model endpoint and API key, and you're ready to go.
```yaml
# Layer 1: Endpoints — how to connect
endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: your-key

# Layer 2: Models — what they can do
models:
  minimax-m27:
    name: MiniMax M2.7
    endpoint: minimax
    model: MiniMax-M2.7
    capabilities: [text, code, agent]   # functional tags
    strengths: [Chinese, creative writing, general]   # domain strengths
    context_window: 256000

# Layer 3: Preferences — what to use when
preferences:
  coding: [minimax-m27]
  copywriting: [minimax-m27]
  default: [minimax-m27]
```

```shell
relay models                                        # list models, capabilities, preferences
relay run minimax-m27 "refactor the auth module"    # relay a task; returns a session ID
relay run minimax-m27 "continue" --session-id <id>  # resume a previous session
relay fork minimax-m27 "continue this analysis from the current conversation"  # fork CC's current session (auto-detects session + context size)
relay run minimax-m27 "write copy" --bare           # clean environment
relay install claude                                # install the Skill
relay update claude                                 # update the Skill
```

Once the Skill is installed, CC automatically intercepts matching tasks and dispatches them through relay:
- Implicit invocation: use CC normally; when a task type matches the preferences in ~/.relay.yaml, CC dispatches via relay instead of handling the task itself
- Explicit invocation: tell CC "use relay to review this code" or "refactor this file with minimax"
- Smart routing: CC picks the minimal tool set per task type (code review gets only Read/Glob/Grep; bug fixes get Read/Edit)
```shell
relay install claude   # install the Skill (first time)
relay update claude    # update the Skill (after upgrading)
```

10 tasks (6 bug fix + 4 code review), CC working alone vs CC + relay:

| Metric | CC alone | CC + relay | Savings |
|---|---|---|---|
| Work tokens (input+output) | 8,977 | 5,381 | -40% |
| CC subscription cost | $1.32 | $1.13 | -14% |
| Correctness | 10/10 | 10/10 | — |
Run `node bench/run.js && node bench/report.js` to reproduce.

Suitable: code generation/refactoring, code review, codebase research, copywriting/docs, data analysis. In short, tasks that can be fully described in one prompt and answered in one shot.

Not suitable: brainstorming, requirements clarification, interactive debugging, teaching/explaining. In short, tasks that need multi-turn interaction with the user.