feat: add Kimi Code CLI engine support #148
Conversation
Add Kimi Code (https://github.com/MoonshotAI/kimi-cli) as a supported AI engine. Kimi CLI supports stream-json output format, --yolo mode, and --model override, following the same patterns as existing engines.

Usage: `ralphy --kimi "your task"`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gmaijoe is attempting to deploy a commit to the Goshen Labs Team on Vercel. A member of the Team first needs to authorize it.
Related Documentation

5 document(s) may need updating based on files changed in this PR. Suggested changes to the "AI Engine Addition Process" document:

@@ -1,10 +1,10 @@
## Architecture and Extension Points
AI engines in Ralphy are implemented as classes extending a shared `BaseAIEngine` abstract class. Each engine defines its name, CLI command, and execution logic. Engines are registered in `cli/src/engines/index.ts` and instantiated via the `createEngine` function, which maps engine names to their respective classes. This modular approach ensures consistent integration and simplifies extension for new engines ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/engines/index.ts#L3-L56)).
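For orientation, the factory described here might look roughly like the following sketch. The registry shape, import paths, and error handling are assumptions, not copied from `cli/src/engines/index.ts`:

```typescript
// Sketch of the engine factory; the real cli/src/engines/index.ts may
// differ in names, imports, and error handling.
import type { BaseAIEngine } from "./base";
import { GeminiEngine } from "./gemini";
import { KimiEngine } from "./kimi";

const engineRegistry: Record<string, new () => BaseAIEngine> = {
  gemini: GeminiEngine,
  kimi: KimiEngine,
  // ...remaining engines registered the same way
};

export function createEngine(name: string): BaseAIEngine {
  const EngineClass = engineRegistry[name];
  if (!EngineClass) {
    throw new Error(`Unknown AI engine: ${name}`);
  }
  return new EngineClass();
}
```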
-Supported engines include Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, GitHub Copilot, Trae Agent, Gemini CLI, and Ollama (via Claude Code CLI). Each engine follows the same integration pattern, allowing for consistent behavior and easy extensibility.
+Supported engines include Claude Code, OpenCode, Cursor, Codex, Qwen-Code, Factory Droid, GitHub Copilot, Trae Agent, Gemini CLI, Ollama (via Claude Code CLI), and Kimi Code. Each engine follows the same integration pattern, allowing for consistent behavior and easy extensibility.
## Adding Command-Line Flags
-To add a new engine, define a unique command-line flag in the CLI argument parser. Ralphy uses the `commander` library in its TypeScript CLI to declare flags such as `--droid` for Factory Droid, `--qwen` for Qwen-Code, `--trae` for Trae Agent, `--gemini` for Gemini CLI, and `--ollama` for Ollama. Update the argument parsing logic to set the engine name when the flag is present ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/cli/args.ts#L26-L145)).
+To add a new engine, define a unique command-line flag in the CLI argument parser. Ralphy uses the `commander` library in its TypeScript CLI to declare flags such as `--droid` for Factory Droid, `--qwen` for Qwen-Code, `--trae` for Trae Agent, `--gemini` for Gemini CLI, `--kimi` for Kimi Code, and `--ollama` for Ollama. Update the argument parsing logic to set the engine name when the flag is present ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/cli/args.ts#L26-L145)).
Example:
```typescript
@@ -13,6 +13,7 @@
.option("--qwen", "Use Qwen-Code")
.option("--trae", "Use Trae Agent")
.option("--gemini", "Use Gemini CLI")
+ .option("--kimi", "Use Kimi Code")
.option("--ollama", "Use Ollama (local models via Claude Code)");
```
Argument parsing:
```typescript
@@ -24,32 +25,33 @@
else if (opts.codex) aiEngine = "codex";
else if (opts.qwen) aiEngine = "qwen";
else if (opts.droid) aiEngine = "droid";
else if (opts.trae) aiEngine = "trae";
else if (opts.copilot) aiEngine = "copilot";
else if (opts.gemini) aiEngine = "gemini";
+else if (opts.kimi) aiEngine = "kimi";
else if (opts.ollama) aiEngine = "ollama";
```
-This ensures that when the `--ollama` flag is provided, the Ollama engine (via Claude Code CLI) is selected for execution.
+This ensures that when the `--kimi` flag is provided, the Kimi Code engine is selected for execution.
## Engine Implementation and Command Execution
Implement the engine as a class extending `BaseAIEngine`. Specify the CLI command and provide `execute` and optionally `executeStreaming` methods. Use the shared `execCommand` and `execCommandStreaming` utilities for command execution, which handle cross-platform compatibility (Node.js, Bun, Windows command wrappers).
**Command Execution Details:**
-- On Windows, npm global packages (like `claude`, `gemini`, or `ollama` via Claude Code) are installed as `.cmd` wrapper scripts. To execute these reliably, Ralphy now uses `shell: true` when spawning processes with Node.js, and wraps commands with `cmd.exe /c` when using Bun. This ensures that `.cmd` wrappers are properly invoked and avoids ENOENT errors, without needing to manually resolve the command path.
+- On Windows, npm global packages (like `claude`, `gemini`, `kimi`, or `ollama` via Claude Code) are installed as `.cmd` wrapper scripts. To execute these reliably, Ralphy now uses `shell: true` when spawning processes with Node.js, and wraps commands with `cmd.exe /c` when using Bun. This ensures that `.cmd` wrappers are properly invoked and avoids ENOENT errors, without needing to manually resolve the command path.
- The `windowsVerbatimArguments` option is set to `false` on Windows and `true` on other platforms to prevent argument escaping issues.
- If a process spawn error occurs, the error message is included in `stderr` (for `execCommand`) or reported via `onLine` (for `execCommandStreaming`), and the promise resolves with an exit code of 1. This maintains backward compatibility and avoids unhandled promise rejections.
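A condensed sketch of the execution utility these bullets describe is shown below. The real `execCommand` in `cli/src/engines/base.ts` also covers Bun (`cmd.exe /c`) and environment merging, so names and details here are assumptions:

```typescript
import { spawn } from "node:child_process";

// Hedged sketch of a cross-platform exec helper, assuming Node.js only.
function execCommand(
  cmd: string,
  args: string[],
  cwd: string,
): Promise<{ stdout: string; stderr: string; exitCode: number }> {
  return new Promise((resolve) => {
    const isWindows = process.platform === "win32";
    const child = spawn(cmd, args, {
      cwd,
      shell: isWindows, // lets npm .cmd wrappers resolve on Windows
      windowsVerbatimArguments: !isWindows, // false on Windows, true elsewhere
    });
    let stdout = "";
    let stderr = "";
    child.stdout?.on("data", (chunk) => (stdout += chunk));
    child.stderr?.on("data", (chunk) => (stderr += chunk));
    // Spawn failures surface in stderr and resolve with exit code 1,
    // matching the backward-compatibility behavior described above.
    child.on("error", (err) =>
      resolve({ stdout, stderr: stderr + err.message, exitCode: 1 }),
    );
    child.on("close", (code) => resolve({ stdout, stderr, exitCode: code ?? 1 }));
  });
}
```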
-**Ollama Example:**
-```typescript
-export class OllamaEngine extends BaseAIEngine {
- name = "Ollama (Claude Code)";
- cliCommand = "claude";
+**Kimi Code Example:**
+```typescript
+export class KimiEngine extends BaseAIEngine {
+ name = "Kimi Code";
+ cliCommand = "kimi";
async execute(prompt: string, workDir: string, options?: EngineOptions): Promise<AIResult> {
- const args = ["--dangerously-skip-permissions", "--verbose", "--output-format", "stream-json"];
- // Default model for Ollama (can be overridden)
- const model = options?.modelOverride || "glm-4.7-flash";
- args.push("--model", model);
+ const args = ["--yolo", "--output-format", "stream-json"];
+ if (options?.modelOverride) {
+ args.push("--model", options.modelOverride);
+ }
if (options?.engineArgs && options.engineArgs.length > 0) {
args.push(...options.engineArgs);
}
@@ -60,17 +62,11 @@
} else {
args.push("-p", prompt);
}
- // Set Ollama-specific environment variables for Claude Code
- const ollamaEnv = {
- ANTHROPIC_AUTH_TOKEN: "ollama",
- ANTHROPIC_API_KEY: "",
- ANTHROPIC_BASE_URL: process.env.OLLAMA_BASE_URL || "http://localhost:11434",
- };
const { stdout, stderr, exitCode } = await execCommand(
this.cliCommand,
args,
workDir,
- ollamaEnv,
+ undefined,
stdinContent,
);
// ...parse output...
@@ -78,6 +74,15 @@
}
```
+**Ollama Example:**
+```typescript
+export class OllamaEngine extends BaseAIEngine {
+ name = "Ollama (Claude Code)";
+ cliCommand = "claude";
+ // ...
+}
+```
+
**Factory Droid Example:**
```typescript
export class DroidEngine extends BaseAIEngine {
  // ...
}
```
@@ -115,7 +120,7 @@
For more details, see the implementation in `cli/src/engines/base.ts`. For the latest error handling and cross-platform logic, refer to the current implementation.
## JSON Output Parsing
-Parse engine output using utility functions or engine-specific logic. For Qwen-Code, Gemini CLI, Ollama (via Claude Code CLI), and Trae Agent, use `parseStreamJsonResult` to extract response text and token counts from JSON lines. For Factory Droid, parse lines for a `completion` event and extract `finalText` and `durationMs` ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/engines/droid.ts#L13-L124), [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/engines/qwen.ts#L14-L100)).
+Parse engine output using utility functions or engine-specific logic. For Qwen-Code, Gemini CLI, Kimi Code, Ollama (via Claude Code CLI), and Trae Agent, use `parseStreamJsonResult` to extract response text and token counts from JSON lines. For Factory Droid, parse lines for a `completion` event and extract `finalText` and `durationMs` ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/engines/droid.ts#L13-L124), [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/engines/qwen.ts#L14-L100)).
Example for Factory Droid:
```typescript
@@ -189,21 +194,51 @@
}
```
-For Gemini CLI, Qwen-Code, and Ollama (via Claude Code CLI), use `parseStreamJsonResult` to extract the response and token counts from the stream-json output format. This approach ensures that output from these engines is parsed correctly, extracting relevant response text and metrics.
+For Gemini CLI, Qwen-Code, Kimi Code, and Ollama (via Claude Code CLI), use `parseStreamJsonResult` to extract the response and token counts from the stream-json output format. This approach ensures that output from these engines is parsed correctly, extracting relevant response text and metrics.
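For reference, a stream-json parser of this kind might look like the sketch below. The actual `parseStreamJsonResult` is not shown in this PR, so the event shapes and field names are assumptions:

```typescript
interface StreamJsonResult {
  response: string;
  inputTokens: number;
  outputTokens: number;
}

// Sketch: scans newline-delimited JSON for a final "result" event and
// accumulates token usage; field names are assumed, not confirmed.
function parseStreamJsonResult(lines: string[]): StreamJsonResult {
  let response = "";
  let inputTokens = 0;
  let outputTokens = 0;
  for (const line of lines) {
    try {
      const event = JSON.parse(line);
      if (event.type === "result" && typeof event.result === "string") {
        response = event.result;
      }
      if (event.usage) {
        inputTokens += event.usage.input_tokens ?? 0;
        outputTokens += event.usage.output_tokens ?? 0;
      }
    } catch {
      // Non-JSON lines (e.g. plain log output) are skipped.
    }
  }
  return { response, inputTokens, outputTokens };
}
```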
## Integration with Parallel Execution and Merge Conflict Resolution
Ralphy supports parallel execution via the `--parallel` and `--max-parallel` flags, creating isolated worktrees and branches for each agent/task. After execution, completed branches are merged back to the base branch. If merge conflicts occur, Ralphy uses AI-assisted conflict resolution by building a prompt listing conflicted files and running the selected engine to resolve conflicts. The process verifies that all conflicts are resolved before completing the merge ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/execution/parallel.ts#L34-L372), [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/execution/conflict-resolution.ts#L8-L81)).
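A simplified sketch of that conflict-resolution flow follows; the prompt wording and helper names are illustrative, and the real logic in `cli/src/execution/conflict-resolution.ts` may differ:

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";
import type { BaseAIEngine } from "../engines/base"; // path assumed

// Hypothetical sketch: build a prompt listing conflicted files, run the
// selected engine, then verify no conflict markers remain.
async function resolveConflictsWithAI(
  engine: BaseAIEngine,
  conflictedFiles: string[],
  workDir: string,
): Promise<boolean> {
  const prompt = [
    "Resolve the git merge conflicts in the following files:",
    ...conflictedFiles.map((f) => `- ${f}`),
  ].join("\n");
  await engine.execute(prompt, workDir);
  // The merge completes only if every marker is gone.
  return conflictedFiles.every(
    (f) => !readFileSync(join(workDir, f), "utf8").includes("<<<<<<<"),
  );
}
```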
## Documentation and Changelog Updates
-Update the README to include the new engine flag (`--ollama`), usage examples, engine details table, and any relevant notes about output metrics (tokens, duration, cost). Add a changelog entry noting the new engine, its flag, and the version bump.
+Update the README to include the new engine flag (`--kimi`), usage examples, engine details table, and any relevant notes about output metrics (tokens, duration, cost). Add a changelog entry noting the new engine, its flag, and the version bump.
Example changelog entry:
```
+### v4.7.0
+- **Kimi Code support**: use Kimi Code CLI (`--kimi`)
+- supports `--model` override, `--yolo` auto-approve, and `stream-json` output
+- requires [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli) installed
+
### v4.6.0
- **Ollama support**: use local models via Ollama's Anthropic-compatible API (`--ollama`)
- recommended models: `qwen3-coder`, `glm-4.7`, `gpt-oss:20b`, `gpt-oss:120b`
- requires [Ollama](https://ollama.com) running locally and Claude Code CLI installed

### v4.5.3
- parallel reliability: fallback to sandbox mode on worktree errors
- error output: include CLI output snippet for failed engine commands
```
+
+Be sure to update any engine tables, usage sections, and notes about output metrics to include Kimi Code.
+
+**Example usage:**
+```bash
+ralphy --kimi "your task"
+ralphy --kimi --model kimi-k2.5 "your task"
+```
+
+**Engine details table (excerpt):**
+| Engine     | Flag       | CLI Requirement                  | Output Format |
+|------------|------------|----------------------------------|---------------|
+| Kimi Code  | `--kimi`   | kimi-cli (npm global)            | stream-json   |
+| Gemini CLI | `--gemini` | gemini-cli (npm global)          | stream-json   |
+| Qwen-Code  | `--qwen`   | qwen (npm global)                | stream-json   |
+| Ollama     | `--ollama` | ollama + claude-cli (npm global) | stream-json   |
+| ...        | ...        | ...                              | ...           |
@@ -223,12 +258,12 @@
Ensure CLI tools are available in the system PATH and use platform-specific detection (`which` on Unix, `where` on Windows). For command execution, Ralphy now uses shell mode (`shell: true` for Node.js, or `cmd.exe /c` for Bun) on Windows to ensure that `.cmd` wrapper scripts for npm global packages are executed correctly. The `windowsVerbatimArguments` option is set to `false` on Windows and `true` on other platforms to prevent argument escaping issues. If a process spawn error occurs, the error message is included in `stderr` or reported via `onLine`, and the process resolves with an exit code of 1, maintaining backward compatibility and avoiding unhandled promise rejections. This approach avoids ENOENT errors and removes the need for custom path resolution logic. Provide clear error messages and installation instructions if the CLI is not found. Test on all supported platforms to confirm compatibility ([source](https://github.com/michaelshimeles/ralphy/issues/52), [source](https://github.com/michaelshimeles/ralphy/issues/64)).
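A minimal sketch of that PATH probe is shown below; the helper name and error message are illustrative, not taken from the repo:

```typescript
import { execSync } from "node:child_process";

// Sketch: returns true if the engine CLI resolves on the PATH,
// using `where` on Windows and `which` elsewhere.
function isCliAvailable(command: string): boolean {
  const probe = process.platform === "win32" ? "where" : "which";
  try {
    execSync(`${probe} ${command}`, { stdio: "ignore" });
    return true;
  } catch {
    return false;
  }
}

// Usage: fail early with an actionable message.
if (!isCliAvailable("kimi")) {
  console.error("kimi CLI not found. Install it and ensure it is on your PATH.");
  process.exit(1);
}
```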
## Example: Adding a New Engine
-1. Implement the engine class extending `BaseAIEngine` (see GeminiEngine for Gemini CLI as an example).
+1. Implement the engine class extending `BaseAIEngine` (see GeminiEngine for Gemini CLI or KimiEngine for Kimi Code as examples).
2. Register the engine in `cli/src/engines/index.ts` and update the CLI argument parser.
-3. Add a command-line flag for engine selection (e.g., `--gemini`).
-4. Implement command execution and output parsing, using `parseStreamJsonResult` for Gemini CLI.
+3. Add a command-line flag for engine selection (e.g., `--gemini`, `--kimi`).
+4. Implement command execution and output parsing, using `parseStreamJsonResult` for Gemini CLI, Kimi Code, and Qwen-Code.
5. Integrate with parallel execution and merge conflict resolution.
-6. Update documentation and changelog to include Gemini CLI.
+6. Update documentation and changelog to include the new engine.
7. Verify functionality and cross-platform compatibility.
-By following these practices and using Factory Droid, Qwen-Code, and Gemini CLI as templates, you can safely and consistently add new AI engines to Ralphy.
+By following these practices and using Factory Droid, Qwen-Code, Gemini CLI, and Kimi Code as templates, you can safely and consistently add new AI engines to Ralphy.

AI Engine Integration

Suggested changes:

@@ -6,7 +6,7 @@
**Engine-Specific Arguments:** You can pass arbitrary arguments to any engine using the `--` separator. Everything after `--` is forwarded directly to the engine CLI. See the 'Engine-Specific Arguments' section for details and examples.
**Requirements:**
-- AI CLI: [Claude Code](https://github.com/anthropics/claude-code), [OpenCode](https://opencode.ai/docs/), [Cursor](https://cursor.com), Codex, Qwen-Code, [Factory Droid](https://docs.factory.ai/cli/getting-started/quickstart), [GitHub Copilot](https://docs.github.com/en/copilot/how-tos/use-copilot-agents/use-copilot-cli), [Gemini CLI](https://github.com/google-gemini/gemini-cli), or [Ollama](https://ollama.com) (requires Claude Code CLI)
+- AI CLI: [Claude Code](https://github.com/anthropics/claude-code), [OpenCode](https://opencode.ai/docs/), [Cursor](https://cursor.com), Codex, Qwen-Code, [Factory Droid](https://docs.factory.ai/cli/getting-started/quickstart), [GitHub Copilot](https://docs.github.com/en/copilot/how-tos/use-copilot-agents/use-copilot-cli), [Gemini CLI](https://github.com/google-gemini/gemini-cli), [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli), or [Ollama](https://ollama.com) (requires Claude Code CLI)
- **npm version (`ralphy-cli`)**: Node.js 18+ or Bun
Each engine requires its CLI tool installed and available in the system PATH.
@@ -23,6 +23,7 @@
| Trae Agent | `trae` | `--trae` | `--print --force --output-format stream-json` | stream-json | tokens, duration (ms) | Model override via `--model`. |
| GitHub Copilot| `copilot` | `--copilot` | `--acp --stdio --yolo` | NDJSON (ACP) | none | Uses Agent Client Protocol (ACP) for structured communication and streaming. Token counts not available. Legacy engine deprecated. |
| Gemini | `gemini` | `--gemini` | `--output-format stream-json --yolo` | stream-json | tokens + cost | Model override via `--model`. |
+| Kimi Code | `kimi` | `--kimi` | `--yolo --output-format stream-json` | stream-json | tokens | Model override via `--model`. CLI must be installed and in PATH. |
| Ollama | `claude` | `--ollama` | `--dangerously-skip-permissions` (Ollama env vars) | stream-json | tokens + cost | Runs local models via Claude Code CLI. |
## Engine Integration Details
@@ -89,6 +90,15 @@
Example:
```bash
ralphy --gemini --model gemini-pro "implement feature"
+```
+
+### Kimi Code
+Integrated as `KimiEngine`. Uses the `kimi` CLI with streaming JSON output. Model override via `--model <name>`. Supports `--yolo` auto-approve mode and accepts additional engine-specific arguments. Token counts are parsed from output. Failed commands return error messages with exit codes. CLI must be installed and available in PATH.
+
+Example:
+```bash
+ralphy --kimi "your task"
+ralphy --kimi --model kimi-k2.5 "your task"
```
### GitHub Copilot
@@ -136,6 +146,7 @@
ralphy --qwen --model qwen-max "build api"
ralphy --trae --model trae-pro "implement feature"
ralphy --ollama --model glm-4.7 "add feature" # Ollama with specific model
+ralphy --kimi --model kimi-k2.5 "your task" # Kimi Code with specific model
```
# Engine-Specific Arguments
@@ -148,6 +159,9 @@
# Pass claude-specific arguments
ralphy --claude "add feature" -- --no-permissions-prompt
+
+# Pass kimi-specific arguments
+ralphy --kimi "your task" -- --custom-arg value
# Works with any engine
ralphy --cursor "fix bug" -- --custom-arg value
@@ -175,6 +189,7 @@
- **Codex**: No token reporting; uses temp files for output.
- **GitHub Copilot**: Requires Copilot CLI installed and available in PATH. Uses ACP protocol for structured communication and streaming. Token counts are not available. Legacy engine deprecated.
- **Ollama**: Requires [Ollama](https://ollama.com) running locally and Claude Code CLI installed and available in PATH. Only models with at least 64k context window are supported. If either dependency is missing, tasks will fail with a clear error message.
+- **Kimi Code**: Requires [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli) installed and available in PATH. Token counts are available. Model override via `--model`. If CLI is missing, tasks will fail with a clear error message.
- **General**: Each engine requires its CLI tool installed and available in the system PATH.
## Example Usage
@@ -188,6 +203,7 @@
ralphy --trae "implement feature" # Trae Agent
ralphy --copilot "add feature" # GitHub Copilot (ACP)
ralphy --gemini "summarize document" # Gemini CLI
+ralphy --kimi "your task" # Kimi Code CLI
ralphy --ollama "add feature" # Ollama (local models via Claude Code CLI)
ralphy --opencode --model opencode/glm-4.7-free "custom model"
@@ -195,6 +211,7 @@
ralphy --trae --model trae-pro "implement feature with Trae"
ralphy --copilot "add feature" -- --allow-all-tools --stream on
ralphy --gemini --model gemini-pro "generate code with Gemini"
+ralphy --kimi --model kimi-k2.5 "your task"
ralphy --ollama --model glm-4.7 "add feature" # Ollama with specific model
```
@@ -217,6 +234,7 @@
- [GitHub Copilot ACP Documentation](https://docs.github.com/en/copilot/reference/acp-server)
- [Agent Client Protocol Spec](https://agentclientprotocol.com/protocol/overview)
- [TypeScript SDK](https://agentclientprotocol.com/libraries/typescript)
+- [Kimi Code CLI](https://github.com/MoonshotAI/kimi-cli)
---
Model Override and Selection

Suggested changes:

@@ -1,5 +1,5 @@
### Purpose and Usage
-By default, each engine (e.g., Claude, OpenCode, Qwen, Trae, Gemini, Ollama) uses its standard model. The `--model` flag lets you specify an alternative model for the selected engine. For convenience, shortcut flags like `--sonnet` are provided, which combine engine selection and model override in a single flag.
+By default, each engine (e.g., Claude, OpenCode, Qwen, Trae, Gemini, Ollama, Kimi) uses its standard model. The `--model` flag lets you specify an alternative model for the selected engine. For convenience, shortcut flags like `--sonnet` are provided, which combine engine selection and model override in a single flag.
#### Per-Task Model Selection in YAML
You can now specify a model for each individual task in your YAML task list using the `model:` property. When present, this per-task model takes precedence over any global model override specified via CLI flags. This allows you to mix and match models for different tasks within the same run, optimizing for cost, speed, or capability as needed.
@@ -71,7 +71,7 @@
[Reference: cli/src/cli/args.ts, cli/src/execution/planning.ts, cli/src/execution/parallel.ts, cli/src/execution/sequential.ts]
### Specifying Models for Different Engines
-You can specify a model for any supported engine by combining the engine flag with `--model`. For example, to use a specific model with OpenCode, use `--opencode --model <model-name>`. The same pattern applies to other engines, including Trae, Gemini, and Ollama:
+You can specify a model for any supported engine by combining the engine flag with `--model`. For example, to use a specific model with OpenCode, use `--opencode --model <model-name>`. The same pattern applies to other engines, including Trae, Gemini, Ollama, and Kimi:
```bash
ralphy --opencode --model opencode/glm-4.7-free "task"
@@ -79,6 +79,7 @@
ralphy --trae --model trae-v1 "do something"
ralphy --gemini --model gemini-1.0-pro "summarize"
ralphy --ollama --model glm-4.7 "add feature"
+ralphy --kimi --model kimi-k2.5 "your task"
```
You can also specify a separate planning model for any engine using `--planning-model <model-name>`. This model will be used only for the planning phase (file prediction), while the main model (from `--model`) is used for code generation and execution:
@@ -89,6 +90,7 @@
ralphy --trae --model trae-v1 --planning-model trae-lite "task"
ralphy --gemini --model gemini-1.0-pro --planning-model gemini-1.0 "summarize"
ralphy --ollama --model glm-4.7 --planning-model glm-3.5 "add feature"
+ralphy --kimi --model kimi-k2.5 --planning-model kimi-k2.5 "your task"
```
Only one engine/model combination can be specified per command invocation. There is no built-in support for specifying different models for multiple engines in a single command; you must run separate commands for each engine/model pair.
@@ -140,6 +142,14 @@
```bash
ralphy --ollama --model glm-4.7 "add feature"
```
+- Use Kimi Code CLI with its default model:
+ ```bash
+ ralphy --kimi "your task"
+ ```
+- Use a custom model with Kimi Code CLI:
+ ```bash
+ ralphy --kimi --model kimi-k2.5 "your task"
+ ```
- Use a separate planning model (e.g., for cost savings):
```bash
ralphy --model opus --planning-model haiku "implement feature"
@@ -147,12 +157,13 @@
ralphy --trae --model trae-v1 --planning-model trae-lite "task"
ralphy --gemini --model gemini-1.0-pro --planning-model gemini-1.0 "summarize document"
ralphy --ollama --model glm-4.7 --planning-model glm-3.5 "add feature"
+ ralphy --kimi --model kimi-k2.5 --planning-model kimi-k2.5 "your task"
```
### Best Practices
- Use shortcut flags (like `--sonnet`) for common engine/model combinations to reduce typing and avoid mistakes.
-- Use `--model` with the appropriate engine flag for custom or less common model selections.
+- Use `--model` with the appropriate engine flag for custom or less common model selections, including `--kimi` for Kimi Code CLI.
- Use `--planning-model` to select a cheaper or faster model for the planning phase, especially in parallel or no-git modes where planning is a distinct step. This can reduce cost and speed up planning without affecting code quality.
- Ensure you specify only one engine/model pair per command.
- Model overrides are consistently applied throughout the execution pipeline, including parallel runs and conflict resolution, so you can rely on the selected model being used for all phases of the operation.
-- Refer to the engine documentation or `ralphy --help` for the list of supported models for each engine.
+- Refer to the engine documentation or `ralphy --help` for the list of supported models for each engine, including Kimi Code CLI.

Root User Detection and Restrictions

Suggested changes:

@@ -2,6 +2,11 @@
Ralphy includes logic to detect when it is being run as the root user. This detection is performed by checking if the effective user ID (EUID) or the output of `id -u` equals 0. If Ralphy determines it is running as root, it applies engine-specific restrictions and messaging to protect against unsafe or unsupported operations [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056).
**Note:** Ollama support is provided via the Claude Code CLI. When using Ollama (`--ollama`), all root user restrictions and behaviors are identical to those for Claude Code.
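For illustration, the equivalent check in TypeScript would be roughly the following sketch; the actual detection lives in `ralphy.sh` and uses `$EUID` / `id -u`:

```typescript
// Sketch of root detection on POSIX systems; ralphy.sh does the
// equivalent check in bash.
function isRunningAsRoot(): boolean {
  // process.getuid is undefined on Windows, where the check does not apply.
  return typeof process.getuid === "function" && process.getuid() === 0;
}
```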
+
+**Kimi Code CLI** is supported as an AI engine (`--kimi` flag). Its root user behavior matches that of other engines that are not blocked as root (see below for details).
+
### Restrictions for Claude Code and Cursor Engines
When running as root, Ralphy does not allow the use of the Claude Code, Ollama (via Claude Code CLI), or Cursor engines. Attempting to use any of these engines as root results in an immediate error and process exit. This restriction is enforced regardless of any permission override flags.
@@ -18,13 +23,13 @@
After displaying these messages, Ralphy exits with a non-zero status [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056).
### Behavior for Other Engines
-For other supported engines—OpenCode, Codex, Qwen-Code, Factory Droid, Trae Agent, GitHub Copilot, and Gemini CLI—Ralphy does not enforce a hard restriction when running as root. Instead, it issues a warning:
+For other supported engines—OpenCode, Codex, Qwen-Code, Factory Droid, Trae Agent, GitHub Copilot, Gemini CLI, and Kimi Code CLI—Ralphy does not enforce a hard restriction when running as root. Instead, it issues a warning:
```
WARNING: Running as root user. Some AI engines may have limited functionality.
```
-Execution continues, but some features may not work as expected due to permission or environment limitations [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056). Trae Agent and Gemini CLI follow this behavior.
+Execution continues, but some features may not work as expected due to permission or environment limitations [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056). Trae Agent, Gemini CLI, and Kimi Code CLI follow this behavior.
**Note:** Ollama (via Claude Code CLI) is not included here, as it is blocked when running as root.
@@ -32,7 +37,7 @@
If you encounter the root restriction, you have two main options:
- Run Ralphy as a non-root user. This is the recommended and most secure approach for all engines.
-- If you must run as root, use an engine that does not enforce the root restriction: OpenCode (`--opencode`), Codex (`--codex`), Qwen-Code (`--qwen`), Factory Droid (`--droid`), Trae Agent (`--trae`), GitHub Copilot (`--copilot`), or Gemini CLI (`--gemini`) [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056).
+- If you must run as root, use an engine that does not enforce the root restriction: OpenCode (`--opencode`), Codex (`--codex`), Qwen-Code (`--qwen`), Factory Droid (`--droid`), Trae Agent (`--trae`), GitHub Copilot (`--copilot`), Gemini CLI (`--gemini`), or Kimi Code CLI (`--kimi`) [source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056).
**Note:** Ollama (via Claude Code CLI) is not available as an alternative when running as root.
```bash
@@ -45,7 +50,7 @@
claude|ollama|cursor)
log_error "Running as root is not supported with $AI_ENGINE."
log_info "The --dangerously-skip-permissions flag cannot be used as root for security reasons."
- log_info "Please run Ralphy as a non-root user, or use a different AI engine (--opencode, --codex, --qwen, --droid, --trae, --copilot, --gemini)."
+ log_info "Please run Ralphy as a non-root user, or use a different AI engine (--opencode, --codex, --qwen, --droid, --trae, --copilot, --gemini, --kimi)."
exit 1
;;
*)
@@ -56,7 +61,7 @@
```
[source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L622-L1056)
-Note: Trae Agent (`--trae`) and Gemini CLI (`--gemini`) are included among the engines that are not blocked as root. Ollama (via Claude Code CLI) is now blocked as root.
+Note: Trae Agent (`--trae`), Gemini CLI (`--gemini`), and Kimi Code CLI (`--kimi`) are included among the engines that are not blocked as root. Ollama (via Claude Code CLI) is now blocked as root.
### Summary Table
| Engine | Root Behavior | User Message / Action |
@@ -71,3 +76,4 @@
| Trae Agent | Warning, continue | Warns about limited functionality, continues. |
| GitHub Copilot | Warning, continue | Warns about limited functionality, continues. |
| Gemini CLI | Warning, continue | Warns about limited functionality, continues. |
+| Kimi Code CLI  | Warning, continue | Warns about limited functionality, continues. |

Root User Restrictions for AI Engines

Suggested changes:

@@ -4,11 +4,11 @@
```
Running as root is not supported with claude/cursor/ollama/gemini.
The --dangerously-skip-permissions flag cannot be used as root for security reasons.
-Please run Ralphy as a non-root user, or use a different AI engine (--opencode, --codex, --qwen, --droid, --trae).
+Please run Ralphy as a non-root user, or use a different AI engine (--opencode, --codex, --qwen, --droid, --kimi, --trae).
```
([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L1044-L1057))
-For all other AI engines (OpenCode, Codex, Qwen, Factory Droid, Trae), Ralphy will continue to run as root but will display a warning:
+For all other AI engines (OpenCode, Codex, Qwen, Factory Droid, Kimi Code, Trae), Ralphy will continue to run as root but will display a warning:
```
Running as root user. Some AI engines may have limited functionality.
```
@@ -21,7 +21,7 @@
The restriction exists to prevent the use of the `--dangerously-skip-permissions` flag as root. Allowing this flag with root privileges would bypass critical permission checks, creating a risk of privilege escalation or unintended system modifications. This is especially relevant for the Claude Code, Cursor, Ollama, and Gemini engines, which rely on this flag for certain operations. Running these engines as root would expose the system to significant security vulnerabilities ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L1044-L1057)).
### Recommended Alternatives
-To avoid these restrictions, run Ralphy as a non-root user. If running as root is unavoidable, select an AI engine that does not enforce this restriction: OpenCode (`--opencode`), Codex (`--codex`), Qwen (`--qwen`), Factory Droid (`--droid`), or Trae (`--trae`). These engines will operate with a warning but do not block execution ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L1044-L1057)).
+To avoid these restrictions, run Ralphy as a non-root user. If running as root is unavoidable, select an AI engine that does not enforce this restriction: OpenCode (`--opencode`), Codex (`--codex`), Qwen (`--qwen`), Factory Droid (`--droid`), Kimi Code (`--kimi`), or Trae (`--trae`). These engines will operate with a warning but do not block execution ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/ralphy.sh#L1044-L1057)).
Note: Gemini (`--gemini`) is not permitted as root and will block execution, similar to Claude, Cursor, and Ollama.
Greptile Overview

Greptile Summary: Adds Kimi Code CLI as a new AI engine option (`--kimi`).

Confidence Score: 4/5
Sequence Diagram

```mermaid
sequenceDiagram
participant User
participant CLI as ralphy CLI
participant Args as args.ts
participant Factory as engines/index.ts
participant Kimi as KimiEngine
participant Base as BaseAIEngine
participant KimiCLI as kimi CLI
User->>CLI: ralphy --kimi "task"
CLI->>Args: parseArgs()
Args->>Args: detect --kimi flag
Args-->>CLI: aiEngine = "kimi"
CLI->>Factory: createEngine("kimi")
Factory->>Kimi: new KimiEngine()
Factory-->>CLI: engine instance
CLI->>Kimi: executeStreaming(prompt, workDir, onProgress)
Kimi->>Kimi: build args: ["--yolo", "--output-format", "stream-json"]
Kimi->>Base: execCommandStreaming("kimi", args, workDir)
Base->>KimiCLI: spawn kimi process
loop Stream output
KimiCLI-->>Base: stream-json lines
Base->>Kimi: onLine callback
Kimi->>Kimi: detectStepFromOutput()
Kimi->>CLI: onProgress(step)
CLI->>User: display progress
end
KimiCLI-->>Base: exit code
Base-->>Kimi: output lines + exit code
Kimi->>Kimi: parseStreamJsonResult()
Kimi->>Kimi: checkForErrors()
Kimi-->>CLI: AIResult (response, tokens)
CLI-->>User: task complete
```
4 files reviewed, 1 comment
The review comment below refers to `cli/src/engines/kimi.ts`, line 23:

```typescript
cliCommand = "kimi";

async execute(prompt: string, workDir: string, options?: EngineOptions): Promise<AIResult> {
  const args = ["--yolo", "--output-format", "stream-json"];
```
Argument order differs from Gemini engine (["--output-format", "stream-json", "--yolo"]). Verify the order is correct for Kimi CLI.
Suggested change:

```diff
- const args = ["--yolo", "--output-format", "stream-json"];
+ const args = ["--output-format", "stream-json", "--yolo"];
```
Summary
- Adds Kimi Code CLI as a supported AI engine, selectable via the `--kimi` flag
- Supports `stream-json` output format, `--yolo` auto-approve mode, and `--model` override — same patterns as existing engines (Claude, Gemini, Qwen)

Usage

```bash
ralphy --kimi "your task"
```
Changes
- `cli/src/engines/kimi.ts`: new `KimiEngine` with `execute` and `executeStreaming`
- `cli/src/engines/types.ts`: add `"kimi"` to the `AIEngineName` union
- `cli/src/engines/index.ts`: register `KimiEngine`
- `cli/src/cli/args.ts`: add the `--kimi` CLI flag

Test plan
- `ralphy --kimi` selects the Kimi engine
- `kimi` CLI availability check works
- `ralphy --kimi "hello world task"`
- `--model` override passes through correctly

🤖 Generated with Claude Code