
Task Relay

Use any AI model as a Claude Code subagent — with full tool access.
Quick Start · 中文说明 · npm



What is Task Relay?

Task Relay turns any Anthropic-compatible API endpoint into a full Claude Code subagent. The external model inherits CC's entire tool suite — Read, Write, Edit, Bash, Glob, Grep — and even skills and MCP servers.

How it works: claude -p supports the ANTHROPIC_BASE_URL environment variable. Task Relay wraps this into a simple CLI that manages model configs, session persistence, and context-aware forking.

┌─────────────────────────────┐
│     Claude Code (main)      │
│  relay run deepseek "task"  │
└──────────┬──────────────────┘
           │ Bash
           ▼
┌─────────────────────────────┐
│     Task Relay (wrapper)    │
│  reads config → injects env │
└──────────┬──────────────────┘
           │ subprocess
           ▼
┌─────────────────────────────┐
│  claude -p (external model) │
│  Full CC tools + skills     │
└─────────────────────────────┘

No custom agentic loop. No tool reimplementation. No MCP server. Just a thin config layer over claude -p.
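The mechanism can be approximated by hand. A minimal sketch (the endpoint URL and key are placeholders, and this bypasses relay's config management and session handling):

```shell
# Manual equivalent of `relay run` (sketch): point claude -p at an
# Anthropic-compatible endpoint via environment variables.
export ANTHROPIC_BASE_URL="https://api.minimaxi.com/anthropic"   # endpoint from ~/.relay.yaml
export ANTHROPIC_API_KEY="your-api-key-here"                     # placeholder key
claude -p "review src/auth.js for security issues" --allowedTools "Read,Glob,Grep"
```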

Prerequisites

  • Claude Code installed and authenticated (claude command available)
  • Node.js >= 18
  • An API key from an Anthropic-compatible provider (e.g. MiniMax, DeepSeek)
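A quick preflight check from a terminal (a sketch; the messages are illustrative):

```shell
# Confirm the claude CLI is on PATH and Node.js is >= 18.
command -v claude >/dev/null 2>&1 || echo "claude CLI not found: install Claude Code first"
major=$(node --version 2>/dev/null | sed 's/^v//' | cut -d. -f1)
[ "${major:-0}" -ge 18 ] || echo "Node.js >= 18 required (found: ${major:-none})"
```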

Quick Start

# 1. Install
npm install -g @fengjunhui31/task-relay

# 2. Install Skill into Claude Code + generate config template
relay install claude

# 3. Edit ~/.relay.yaml — add your API key
# 4. Test
relay models
relay run minimax-m27 "say hello"

Edit ~/.relay.yaml — add your API key:

endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: your-api-key-here

models:
  minimax-m27:
    name: MiniMax M2.7
    endpoint: minimax
    model: MiniMax-M2.7
    capabilities: [text, code, agent]
    strengths: [Chinese, creative, general]
    context_window: 256000

preferences:
  coding: [minimax-m27]
  default: [minimax-m27]

Done. Claude Code now knows how to delegate tasks to external models.

Using with Claude Code

After relay install claude, a Skill is injected into CC. It intercepts subagent dispatch and routes tasks to external models automatically.

Implicit (Automatic)

Once the Skill is installed, CC will automatically use relay when it matches a task type to your preferences in ~/.relay.yaml. You don't need to do anything — just use CC normally:

You: "Review this file for security issues"
CC:  (internally) preferences.code_review → minimax-m27
     → runs `relay run minimax-m27 "review..." --allowedTools "Read,Glob,Grep"`
     → returns review result

Explicit

You can also explicitly ask CC to use relay:

You: "Use relay to refactor src/auth.js with minimax"
CC:  → relay run minimax-m27 "refactor src/auth.js" --allowedTools "Read,Write,Edit,Bash"

Smart Routing

The Skill includes routing intelligence — CC picks the right tools and options per task type:

| Task Type | Tools | Options |
|---|---|---|
| Code review / research | Read,Glob,Grep | Read-only |
| Bug fix (single file) | Read,Edit | Minimal |
| Coding (multi-file) | Read,Write,Edit,Glob,Grep,Bash | Full |
| Copywriting / docs | (none) | No tools needed |
| Needs conversation context | | relay fork |
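As shell commands, the routing rows above might look like the following (the tasks and file names are hypothetical; the tool lists mirror the table):

```shell
# Read-only review: no write access
relay run minimax-m27 "review src/auth.js for security issues" --allowedTools "Read,Glob,Grep"
# Single-file bug fix: minimal tools
relay run minimax-m27 "fix the null check in src/session.js" --allowedTools "Read,Edit"
# Multi-file coding: full tool set
relay run minimax-m27 "add rate limiting across the API" --allowedTools "Read,Write,Edit,Glob,Grep,Bash"
# Copywriting: no tools needed
relay run minimax-m27 "draft the release notes" --bare
# Needs conversation context: fork the live session
relay fork minimax-m27 "continue the analysis with our current conversation"
```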

Updating the Skill

relay update claude    # re-install latest Skill from your relay version

Commands

| Command | Description |
|---|---|
| relay models | List models, capabilities, preferences, and available claude -p options |
| relay run <model> "<task>" | Relay a task; returns result + session ID |
| relay fork <model> "<task>" | Fork CC's current session into an external model (auto-detects session, checks context size) |
| relay install claude | Install the Skill into Claude Code |
| relay update | Update relay itself |
| relay update claude | Update the Claude Code Skill |

Features

Session Persistence

Every relay run returns a session ID. CC decides whether to continue:

relay run minimax-m27 "analyze the codebase"
# → result + [relay] session: abc-123

relay run minimax-m27 "now refactor auth" --session-id abc-123
# → continues with full context from previous call

Context-Aware Fork

Fork CC's live session into an external model. Auto-detects the active session via process tree, and warns if context exceeds 60% of the target model's window:

relay fork minimax-m27 "continue this analysis with the full conversation context"

claude -p Passthrough

Extra arguments are forwarded to claude -p. Recommended options:

relay run minimax-m27 "review code" --allowedTools "Read,Glob,Grep"
relay run minimax-m27 "write docs" --bare
relay run minimax-m27 "refactor" --worktree
relay run minimax-m27 "explain in Chinese" --system-prompt "用中文回复"
relay run minimax-m27 "safe refactor" --permission-mode plan

Relay-specific options:

relay run minimax-m27 "continue" --session-id abc-123   # resume session
relay run minimax-m27 "big task" --timeout 900000        # custom timeout (ms, default 600s)

Three-Layer Config

# Layer 1: Endpoints — how to connect
endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: sk-xxx

# Layer 2: Models — what they can do
models:
  minimax-m27:
    endpoint: minimax
    capabilities: [text, code, agent]    # functional tags
    strengths: [Chinese, creative]       # domain strengths
    context_window: 256000

# Layer 3: Preferences — what to use when
preferences:
  coding: [deepseek-v3, minimax-m27]     # priority order
  copywriting: [minimax-m27]
  default: [minimax-m27]

When to Relay

| Suitable | Not Suitable |
|---|---|
| Code generation / refactoring | Brainstorming |
| Code review | Requirements clarification |
| Codebase research | Interactive debugging |
| Writing docs / copy | Teaching / explaining |
| Data analysis | Deployment with manual steps |

Rule of thumb: If the task can be fully described in one prompt and the result returned in one shot — relay it.
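For instance (both prompts are illustrative), a fully-specified task that relays well versus one that needs a conversation:

```shell
# One-shot, fully described: good relay candidate
relay run minimax-m27 "find every use of the deprecated md5() helper in src/ and list file:line for each" --allowedTools "Read,Glob,Grep"

# Needs back-and-forth with the user: keep it in the main CC session
# "help me decide between session cookies and JWTs for auth"
```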

Benchmark: CC Token Savings

10 tasks (6 bug fix + 4 code review), comparing CC doing everything vs CC orchestrating + relay executing (MiniMax M2.7).

Results

| Metric | Group A (CC only) | Group B (CC + relay) | Delta |
|---|---|---|---|
| Work tokens (input+output) | 8,977 | 5,381 | -40.1% |
| Total tokens (incl. cache) | 628,247 | 512,546 | -18.4% |
| Cost (CC subscription) | $1.32 | $1.13 | -14.2% |
| Correctness | 10/10 | 10/10 | |

By Task Type

| Type | Tasks | A correct | B correct | Work token savings |
|---|---|---|---|---|
| Bug fix | 6 | 6/6 | 6/6 | 27% |
| Code review | 4 | 4/4 | 4/4 | 57% |

Key Findings

  • Work tokens cut by ~40% — CC's orchestration overhead (dispatch + process result) is much smaller than doing the work itself
  • Code review benefits most — output tokens reduced by 57% (CC tends to be verbose; relay is more concise)
  • Cache overhead dominates — both groups pay ~48K-96K tokens for system context loading per task, diluting the work token savings in total numbers
  • Cost savings at 14% — cache tokens are cheap, so total cost savings lag behind work token savings
  • Time tradeoff — relay adds ~1.5-2x latency (external API round-trip), but runs in background so doesn't block the user

Run node bench/run.js && node bench/report.js to reproduce. See bench/ for details.

Roadmap: Long-Task Benchmark

Current benchmark uses short single-step tasks where cache overhead dominates. Next iteration will focus on long tasks (multi-file, multi-step) where relay savings compound:

  • SWE-bench Verified subset — real GitHub issue fixes requiring repo-wide exploration + multi-file edits
  • Multi-step refactoring — extract module → update imports → add tests → update docs
  • Codebase-wide review — review 5+ files, cross-reference issues, produce structured report
  • Expected delta: work tokens savings should exceed 60% on long tasks as CC's per-step overhead accumulates while relay handles everything in one external session

Compatible Endpoints

Any Anthropic-compatible API endpoint works. Non-Anthropic models (GPT-4o, Qwen, etc.) require a proxy like one-api or litellm.
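For example, a proxied endpoint entry might look like this (a sketch: the URL, port, key, and model name are placeholders, and it assumes the proxy exposes an Anthropic-compatible API at that base URL):

```yaml
endpoints:
  local-proxy:
    base_url: http://localhost:4000     # e.g. a local litellm or one-api instance
    api_key: your-proxy-key-here

models:
  gpt-4o:
    endpoint: local-proxy
    model: gpt-4o                       # model name as the proxy knows it
    capabilities: [text, code, agent]
    context_window: 128000
```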

License

MIT


中文说明 (Chinese Summary)

Task Relay lets Claude Code call any AI model as if it were a native subagent.

How It Works

claude -p supports the ANTHROPIC_BASE_URL environment variable. Task Relay uses this to point the CC runtime at any Anthropic-compatible endpoint, so the external model automatically inherits CC's full tool suite (Read/Write/Edit/Bash, etc.), plus skills and MCP.

No custom agentic loop, no tool library, no MCP server. Just a thin config layer over claude -p.

Installation

npm install -g @fengjunhui31/task-relay
relay install claude    # install the Skill into Claude Code

Edit ~/.relay.yaml, add your model endpoint and API key, and you're ready to go.

Three-Layer Config

# Layer 1: Endpoints (how to connect)
endpoints:
  minimax:
    base_url: https://api.minimaxi.com/anthropic
    api_key: your-key

# Layer 2: Models (what they can do)
models:
  minimax-m27:
    name: MiniMax M2.7
    endpoint: minimax
    model: MiniMax-M2.7
    capabilities: [text, code, agent]       # functional tags
    strengths: [Chinese, creative writing, general]   # domain strengths
    context_window: 256000

# Layer 3: Preferences (which model for which task)
preferences:
  coding: [minimax-m27]
  copywriting: [minimax-m27]
  default: [minimax-m27]

Commands

relay models                                       # list models, capabilities, preferences
relay run minimax-m27 "refactor the auth module"   # relay a task; returns a session ID
relay run minimax-m27 "continue" --session-id <id> # resume a previous session
relay fork minimax-m27 "continue the analysis from the current conversation"   # fork CC's current session (auto-detects session + context size)
relay run minimax-m27 "write some copy" --bare     # clean environment
relay install claude                               # install the Skill
relay update claude                                # update the Skill

Using with Claude Code

Once the Skill is installed, CC automatically intercepts matching tasks and dispatches them through relay:

  • Implicit: use CC normally; when a task type matches the preferences in ~/.relay.yaml, CC routes it through relay instead of handling it itself
  • Explicit: tell CC "use relay to review this code" or "refactor this file with minimax"
  • Smart Routing: CC picks the minimal tool set per task type (code review gets only Read/Glob/Grep; bug fixes get Read/Edit)

relay install claude   # install the Skill (first time)
relay update claude    # update the Skill (after an upgrade)

Benchmark

10 tasks (6 bug fix + 4 code review), CC working alone vs CC + relay:

| Metric | CC alone | CC + relay | Savings |
|---|---|---|---|
| Work tokens (input+output) | 8,977 | 5,381 | -40% |
| CC subscription cost | $1.32 | $1.13 | -14% |
| Correctness | 10/10 | 10/10 | |

Run node bench/run.js && node bench/report.js to reproduce.

Tasks Worth Relaying

Code generation/refactoring, code review, codebase research, copy/docs, data analysis: any task that can be fully described in one prompt with the result returned in one shot.

Tasks Not Worth Relaying

Brainstorming, requirements clarification, interactive debugging, teaching/explanation: tasks that require multi-turn interaction with the user.
