AI coding harness / BL4CKP1NK 1N Y0UR AREA
- Native Rust TUI with a TypeScript harness engine
- Canonical sessions with resume, hydrate, and saved session reopening
- Mode-aware delegation across JENNIE, LISA, ROSÉ, and JISOO
- Team orchestration with parallel, sequential, and delegated runs
- Isolated delegated runs with git worktrees for provider-backed agent sessions
- Workflow state with todos, permission profiles, remote provider session state, detached background jobs, and automatic verification
- Sidebar rails for git status, subagents, detached background work, MCP servers, and LSP status
- LLM-powered context compaction with structured handoff (Goal, Decisions, Progress, Files) and automatic prune-before-summarize
- Delta streaming pipeline with ~80-byte events, blinking cursor, live token counter, and thinking breathing animation
- Muted inline tool call rendering and color-coded context meter in the footer bar
- Zone-differentiated visual encoding: recessed sidebar, elevated composer, and distinct message styles for user, assistant, and system content
- Result augmentation engine that injects behavioral nudges (verification reminders, diagnostic hints, tool suggestions) based on execution context
- API resilience with automatic retry, exponential backoff with jitter, and error classification (retryable vs auth vs fatal)
- Arrow-Up prompt history recall in the composer with deduplication and stash/restore
- Skills, hooks, MCP tools, LSP-backed retrieval, git-aware retrieval, and layered memory
- Per-session cost budget tracking with configurable warning thresholds and hard-stop enforcement
- Scored memory promotion pipeline with confidence metadata, Jaccard dedupe, and merge
- Parallel benchmarking with multi-model comparison, failure categorization, and cost tracking
- Parallelized boot sequence with two-phase Promise.all initialization and resilient error handling per phase
- Research clarification interview that detects vague prompts and asks scoping questions before fan-out
- Non-blocking post-response verification that releases the composer immediately after streaming completes
- Path traversal protection in file tools with working-directory boundary enforcement
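The API resilience item in the list above (automatic retry, exponential backoff with jitter, and retryable vs auth vs fatal classification) can be sketched roughly as follows. This is a minimal illustration, not ddudu's implementation: the status-code mapping, the default delays, and the names `classifyStatus`, `backoffDelayMs`, and `withRetry` are all assumptions made for the example.

```typescript
// The three categories come from the feature list; everything else here is assumed.
type ErrorClass = "retryable" | "auth" | "fatal";

// ASSUMPTION: a plausible mapping from HTTP status codes to the three categories.
function classifyStatus(status: number): ErrorClass {
  if (status === 401 || status === 403) return "auth";
  if (status === 408 || status === 429 || status >= 500) return "retryable";
  return "fatal";
}

// Exponential backoff with "full jitter": a random whole-millisecond delay
// below min(capMs, baseMs * 2^attempt). `random` is injectable for testing.
function backoffDelayMs(
  attempt: number,
  baseMs = 250,
  capMs = 8000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry wrapper: only errors classified as retryable are retried,
// and the last error is rethrown once maxAttempts calls have failed.
async function withRetry<T>(
  call: () => Promise<T>,
  classify: (err: unknown) => ErrorClass,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const lastAttempt = attempt + 1 >= maxAttempts;
      if (classify(err) !== "retryable" || lastAttempt) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Auth and fatal errors are rethrown immediately, since waiting does not fix a bad credential or a malformed request.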
ddudu treats a coding harness as a small operating layer around provider runtimes rather than as a prompt wrapper.
The core system is organized around:
- an execution kernel for provider runtimes, tools, and permissions
- a context engine for retrieval, compaction, memory selection, result augmentation, and handoff
- a session/state layer for canonical transcripts, artifacts, jobs, and checkpoints
- an orchestration layer for routing, delegation, verification, and recovery
- an operator surface for making progress, ownership, and risk legible
The operator surface is not decoration. It is the control plane.
ddudu's UX is built on a few concrete beliefs:
- Perceived speed matters as much as actual speed. Streaming deltas, breathing animations, blinking cursors, and live token counters exist because a system that looks alive earns more trust than one that looks frozen.
- Visual encoding should carry meaning, not just color. The TUI uses zone-differentiated backgrounds (recessed sidebar, neutral main, elevated composer) so the operator can orient spatially. User messages are pure white, assistant messages are warm neutral, system messages are muted. These are not theme choices — they are information hierarchy decisions.
- Tool calls should recede, not compete. Muted inline rendering (✓/✗/spinner) keeps tool activity visible without flooding the transcript. The operator should see answers, not plumbing.
- The system should nudge the model, not just the operator. Result augmentation injects verification reminders and diagnostic hints after tool calls so the model self-corrects without waiting for the operator to notice a missed step.
- History is muscle memory. Arrow-Up prompt recall means repeated workflows are fast. The composer is an input device, not a one-shot text field.
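The result augmentation idea above (appending verification reminders, diagnostic hints, and tool suggestions to what the model sees after a tool call) might look something like the sketch below. The `ToolOutcome` shape, the rule set, the nudge wording, and the `[harness]` prefix are all hypothetical; only the tool names are taken from this README.

```typescript
// Hypothetical shape of a finished tool call; not ddudu's real type.
interface ToolOutcome {
  tool: string;   // e.g. "edit_file", "test_runner"
  ok: boolean;    // whether the tool call succeeded
  output: string; // raw text returned to the model
}

// Each rule inspects one outcome and may contribute a single nudge.
type NudgeRule = (outcome: ToolOutcome) => string | null;

// ASSUMPTION: illustrative rules, one per nudge category named in the README.
const nudgeRules: NudgeRule[] = [
  // Verification reminder after a successful write.
  (o) =>
    o.ok && (o.tool === "edit_file" || o.tool === "write_file")
      ? "Reminder: run verify_changes before reporting this edit as done."
      : null,
  // Diagnostic hint after a failed test run.
  (o) =>
    !o.ok && o.tool === "test_runner"
      ? "Hint: read the first failing assertion before changing code."
      : null,
  // Tool suggestion when a literal search comes back empty.
  (o) =>
    o.ok && o.tool === "grep" && o.output.trim() === ""
      ? "Suggestion: try codebase_search with a natural-language query."
      : null,
];

// Append every matching nudge to the tool output; leave it untouched otherwise.
function augmentResult(outcome: ToolOutcome, rules: NudgeRule[] = nudgeRules): string {
  const nudges = rules
    .map((rule) => rule(outcome))
    .filter((n): n is string => n !== null);
  if (nudges.length === 0) return outcome.output;
  return [outcome.output, ...nudges.map((n) => `[harness] ${n}`)].join("\n");
}
```

Keeping the rules as plain functions over the execution context makes each nudge independently testable and easy to disable.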
The deeper technical notes live in docs/:
- Design Principles
- Harness Anatomy
- Context Engine
- Memory System
- Session And State
- Delegation And Artifacts
- Trust And Sandbox
- Operator Surface
| Mode | Provider | Model | Role |
|---|---|---|---|
| JENNIE | Anthropic | claude-opus-4-6 | orchestration, verification, delegation |
| LISA | OpenAI / Codex | gpt-5.4 | fast execution, low-overhead action |
| ROSÉ | Anthropic | claude-sonnet-4-6 | planning, architecture, careful reasoning |
| JISOO | Gemini | gemini-2.5-pro | design, UI/UX, visual thinking |
Recommended setup: run this default four-mode lineup together and let ddudu route or delegate between them as needed.
If only one provider is authenticated, ddudu still keeps the four-mode surface and resolves each mode to the best available fallback. In practice that means a Claude-only setup still gives you Opus 4.6 for orchestration and Sonnet 4.6 for planning/execution fallbacks, while a Codex-only setup collapses the modes onto GPT-5.4.
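To make the fallback behavior concrete, here is a small sketch of how a mode could resolve against whichever providers are authenticated. The default lineup is copied from the mode table; the fallback order, the Gemini-only behavior, and the names `resolveMode`, `fallbackModel`, and `fallbackOrder` are assumptions for illustration, since the README only describes the Claude-only and Codex-only cases.

```typescript
type Provider = "anthropic" | "openai" | "gemini";
type ModeName = "JENNIE" | "LISA" | "ROSÉ" | "JISOO";

interface ModeTarget {
  provider: Provider;
  model: string;
}

// Default four-mode lineup, taken from the mode table.
const defaultLineup: Record<ModeName, ModeTarget> = {
  JENNIE: { provider: "anthropic", model: "claude-opus-4-6" },
  LISA: { provider: "openai", model: "gpt-5.4" },
  "ROSÉ": { provider: "anthropic", model: "claude-sonnet-4-6" },
  JISOO: { provider: "gemini", model: "gemini-2.5-pro" },
};

// Per-provider fallback model. The anthropic and openai entries follow the
// README's description; the gemini entry is an ASSUMPTION.
const fallbackModel: Record<Provider, (mode: ModeName) => string> = {
  anthropic: (mode) => (mode === "JENNIE" ? "claude-opus-4-6" : "claude-sonnet-4-6"),
  openai: () => "gpt-5.4",
  gemini: () => "gemini-2.5-pro",
};

// ASSUMPTION: the order in which fallback providers are tried.
const fallbackOrder: Provider[] = ["anthropic", "openai", "gemini"];

// Resolve a mode to its preferred target, or to the first available fallback.
function resolveMode(mode: ModeName, available: Set<Provider>): ModeTarget | null {
  const preferred = defaultLineup[mode];
  if (available.has(preferred.provider)) return preferred;
  const provider = fallbackOrder.find((p) => available.has(p));
  return provider ? { provider, model: fallbackModel[provider](mode) } : null;
}
```

With only Anthropic authenticated, `resolveMode("LISA", new Set<Provider>(["anthropic"]))` yields `claude-sonnet-4-6` under these assumptions, while a Codex-only set maps every mode to `gpt-5.4`.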
Shift+Tab cycles modes inside the TUI.
Additional native TUI shortcuts:
- Ctrl+K opens the command palette
- Ctrl+Y opens the saved-session picker
- Ctrl+P starts an @file picker in the composer
- Ctrl+L clears the current transcript
- Ctrl+C interrupts the active run or exits when idle
| Tool | Purpose |
|---|---|
| read_file | read files into the working context |
| write_file | create or overwrite files |
| edit_file | patch existing files |
| list_dir | inspect directory contents |
| git_status | inspect repository status |
| git_diff | inspect working tree or staged diffs |
| patch_apply | validate or apply unified diff patches |
| bash | run shell commands |
| lint_runner | run lint/typecheck with structured summaries |
| test_runner | run tests with structured failure highlights |
| build_runner | run builds with structured output and summaries |
| verify_changes | run the harness verification loop over current changes |
| grep | search file contents |
| glob | match paths by pattern |
| repo_map | render a compact repository tree |
| symbol_search | find symbol definitions; mode: "resolve" for precise LSP-backed lookup |
| reference_search | find cross-file references and usages; group_by_file: true for hotspot aggregation |
| changed_files | list git-changed files for active-context retrieval |
| file_importance | rank likely relevant files for the current request |
| codebase_search | score files and lines against a natural-language query |
| docs_lookup | search local repo docs, instructions, and knowledge files |
| web_search | search the web with concise ranked results |
| web_fetch | fetch and summarize remote pages |
| task | delegate work to a sub-agent |
| oracle | ask a stronger secondary model for a focused answer |
| ask_question | pause and request user input inside a run |
| memory | read or write persistent memory |
| update_plan | manage the shared execution plan / todo list |
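The feature list mentions path traversal protection in the file tools with working-directory boundary enforcement. A minimal sketch of that kind of check is shown below, written as pure string handling over POSIX-style paths so it stays self-contained. The helper names `normalizePosix` and `resolveInside` are hypothetical, symlinks are deliberately ignored, and a real implementation would more likely lean on Node's `path.resolve` and `fs.realpath`.

```typescript
// Normalize a POSIX-style path: collapse ".", "..", and repeated slashes.
function normalizePosix(path: string): string {
  const absolute = path.startsWith("/");
  const stack: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (stack.length > 0 && stack[stack.length - 1] !== "..") stack.pop();
      else if (!absolute) stack.push(".."); // relative paths may climb upward
    } else {
      stack.push(part);
    }
  }
  return (absolute ? "/" : "") + stack.join("/");
}

// Resolve `requested` against `workdir` and reject anything that lands outside it.
// ASSUMPTION: workdir is an absolute path; symlinks are not resolved in this sketch.
function resolveInside(workdir: string, requested: string): string {
  const root = normalizePosix(workdir);
  const target = requested.startsWith("/")
    ? normalizePosix(requested)
    : normalizePosix(`${root}/${requested}`);
  // Compare against "root/" so that "/repo-evil" does not pass as inside "/repo".
  const prefix = root === "/" ? "/" : `${root}/`;
  const inside = target === root || target.startsWith(prefix);
  if (!inside) throw new Error(`path escapes working directory: ${requested}`);
  return target;
}
```

Under these assumptions, `resolveInside("/repo", "src/../b.ts")` returns `/repo/b.ts`, while `../etc/passwd` and `/repo-evil/x` are both rejected.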
Today the supported install path is from source.
Because the TUI is a native Rust binary, a portable npm install -g ddududdudu release needs platform-specific packaged binaries first. Until that release pipeline exists, install from source on the target machine.
- Node.js >= 20
- Rust stable toolchain
npm install
npm run install:global
ddudu
npm run install:global packs the current checkout and installs that tarball globally, which avoids the live symlink behavior of npm install -g . on newer npm versions. This is the recommended source install path.
ddudu <command> --help and ddudu help <command> print command-specific usage.
If you want to run the equivalent steps manually:
TARBALL="$(npm pack --silent)"
npm install -g "./$TARBALL"
rm "./$TARBALL"
If you are developing ddudu itself and explicitly want a live symlink into the current repo, use:
npm install
npm run build
npm link
If you do not want a global install:
npm install
npm run build
node dist/index.js
ddudu can reuse existing provider auth instead of forcing new secrets everywhere.
Supported auth paths today:
- Claude: claude auth login or ANTHROPIC_API_KEY
- Codex/OpenAI: codex login or OPENAI_API_KEY
- Gemini: GEMINI_API_KEY or ~/.gemini/oauth_creds.json
Check what ddudu sees:
ddudu auth
Start or refresh login from ddudu:
ddudu auth login
ddudu auth login claude
ddudu auth login codex
ddudu auth login codex --api-key
ddudu auth login opens an interactive Arrow-key picker. You can choose a vendor login flow or register an API key in ~/.ddudu/auth.yaml, depending on the provider.
After login, ddudu rechecks local credentials and shows how the current four-mode lineup resolves against the providers you actually have available.
ddudu keeps one canonical session and layers provider-specific sessions on top of it.
By default, ddudu stores sessions and operator settings in ~/.ddudu/ so the experience follows you across repositories.
Project-local .ddudu/config.yaml remains available as an explicit override layer when a repo really needs different policy.
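The layering described here (global settings in ~/.ddudu/ with a project-local override on top) amounts to a deep merge in which the project value wins key by key. The sketch below illustrates that idea only; the `ConfigTree` type and `mergeConfig` function are hypothetical, and ddudu's actual merge rules (for example, how lists are combined) are not documented in this README.

```typescript
// Hypothetical in-memory shape for parsed YAML config; not ddudu's real type.
type ConfigValue = string | number | boolean | ConfigTree;
type ConfigTree = { [key: string]: ConfigValue };

// Narrow a value to a nested config section.
function isTree(value: ConfigValue | undefined): value is ConfigTree {
  return typeof value === "object";
}

// Deep-merge two layers: nested sections are merged recursively,
// and any other value in `override` replaces the one in `base`.
function mergeConfig(base: ConfigTree, override: ConfigTree): ConfigTree {
  const merged: ConfigTree = { ...base };
  for (const key of Object.keys(override)) {
    const baseValue = merged[key];
    const overrideValue = override[key];
    merged[key] =
      isTree(baseValue) && isTree(overrideValue)
        ? mergeConfig(baseValue, overrideValue)
        : overrideValue;
  }
  return merged;
}
```

For instance, a global layer of `{ tools: { permission: "workspace-write" } }` merged with a project layer of `{ tools: { permission: "ask" } }` resolves `tools.permission` to `ask`, while global keys the project does not mention are kept.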
ddudu init scaffolds .ddudu/config.yaml, .ddudu/DDUDU.md, root AGENTS.md, and starter hook templates. AGENTS.md is loaded automatically as the shared cross-tool instruction file.
- session list, session last, and session resume <id> reopen saved sessions
- resume, /resume, and /session resume are all supported aliases for reopening saved work
- session pick opens an interactive saved-session picker with Arrow-key selection
- provider runtimes keep remote session IDs so the harness can resume or hydrate when context advances
- delegated execution can spin up isolated git worktrees instead of sharing the parent working tree
- background runs can continue as detached workers, keep inspectable job state, and can be retried, promoted, cancelled, or reopened later
- /plan and /todo manage the shared execution plan
- /permissions switches between plan, ask, workspace-write, and permissionless, and can pin per-tool, network, and secret trust policies
- direct and delegated execution paths can auto-run review checks, repair retries, and verification summaries
- successful verified runs can promote compact semantic and procedural memory entries automatically
- /handoff, /fork, /briefing, and /drift help carry long-running work forward without losing context
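The memory promotion step is described elsewhere in this README as a scored pipeline with confidence metadata, Jaccard dedupe, and merge. As a rough sketch of the dedupe part: Jaccard similarity is the size of the intersection of two token sets divided by the size of their union, and entries above a threshold can be treated as the same memory. The `MemoryEntry` shape, the 0.6 threshold, the tokenizer, and the "keep the higher-confidence wording" merge policy are all assumptions.

```typescript
// Hypothetical memory entry; the README only says entries carry confidence metadata.
interface MemoryEntry {
  text: string;
  confidence: number; // 0..1
}

// Lowercase word tokens; a deliberately simple tokenizer for the sketch.
function tokenize(text: string): Set<string> {
  return new Set(
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((token) => token.length > 0),
  );
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 1 for two empty sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Keep one entry per near-duplicate cluster.
// ASSUMPTION: on a match, the higher-confidence wording and score win.
function dedupeAndMerge(entries: MemoryEntry[], threshold = 0.6): MemoryEntry[] {
  const kept: MemoryEntry[] = [];
  for (const entry of entries) {
    const twin = kept.find(
      (k) => jaccard(tokenize(k.text), tokenize(entry.text)) >= threshold,
    );
    if (!twin) {
      kept.push({ ...entry });
    } else if (entry.confidence > twin.confidence) {
      twin.text = entry.text;
      twin.confidence = entry.confidence;
    }
  }
  return kept;
}
```

Token-set overlap is cheap and needs no embeddings, which is a plausible reason to use it for short procedural notes, though it will miss paraphrases that share few words.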
ddudu # launch TUI
ddudu auth # show detected auth
ddudu init # initialize .ddudu/ in current project
ddudu doctor # basic environment check
ddudu config show # print merged config
ddudu config set tools.permission ask # write global config by default
ddudu config set --project tools.permission ask # write a project override
ddudu session list # list saved sessions
ddudu session pick # choose a saved session interactively
ddudu session last # reopen the most recent saved session
ddudu session resume <id> # reopen a saved session in the native TUI
ddudu resume # quick alias for the latest saved session
ddudu resume <id> # quick alias for reopening a specific session
ddudu config show prints the merged config with sensitive values redacted.
The benchmark system runs harness tasks against configurable task packs and produces structured JSONL reports.
# Run benchmarks with default settings
node bench/run.mjs --tasks bench/tasks.example.yaml
# Parallel execution with concurrency control
node bench/run.mjs --tasks bench/tasks.yaml --concurrency 5
# Multi-model comparison
node bench/run.mjs --tasks bench/tasks.yaml --models claude-opus-4-6,gpt-5.4
# Resume an interrupted run (skips completed tasks)
node bench/run.mjs --tasks bench/tasks.yaml --resume --output bench/results/run-prev.jsonl
# Generate a report
node bench/report.mjs bench/results/run-*.jsonl
Reports include:
- per-difficulty pass rate breakdown
- failure mode categorization (timeout, crash, setup-failed, verification-failed, wrong-output)
- cost tracking (estimated USD per task and total)
- multi-model side-by-side comparison when multiple models are present
Task packs and the full comparison workflow are still being refined.
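To illustrate how the report fields listed above could be derived from a JSONL results file, here is a small aggregation sketch. The record fields (`task`, `model`, `difficulty`, `passed`, `failure`, `costUsd`) are hypothetical, since the actual schema written by bench/run.mjs is not shown in this README; only the failure mode names are taken from the list above.

```typescript
// Failure modes as named in the report description.
type FailureMode = "timeout" | "crash" | "setup-failed" | "verification-failed" | "wrong-output";

// ASSUMPTION: one JSON object per line with these fields.
interface BenchRecord {
  task: string;
  model: string;
  difficulty: string;
  passed: boolean;
  failure?: FailureMode;
  costUsd: number;
}

interface BenchSummary {
  passRate: { [difficulty: string]: number };
  failures: { [mode: string]: number };
  totalCostUsd: number;
}

// Parse JSONL text, skipping blank lines.
function parseJsonl(text: string): BenchRecord[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as BenchRecord);
}

// Aggregate pass rate per difficulty, failure counts per mode, and total cost.
function summarize(records: BenchRecord[]): BenchSummary {
  const buckets: { [difficulty: string]: { pass: number; total: number } } = {};
  const failures: { [mode: string]: number } = {};
  let totalCostUsd = 0;
  for (const record of records) {
    const bucket =
      buckets[record.difficulty] || (buckets[record.difficulty] = { pass: 0, total: 0 });
    bucket.total++;
    if (record.passed) bucket.pass++;
    else if (record.failure) failures[record.failure] = (failures[record.failure] || 0) + 1;
    totalCostUsd += record.costUsd;
  }
  const passRate: { [difficulty: string]: number } = {};
  for (const difficulty of Object.keys(buckets)) {
    passRate[difficulty] = buckets[difficulty].pass / buckets[difficulty].total;
  }
  return { passRate, failures, totalCostUsd };
}
```

A multi-model comparison would follow the same pattern with the records grouped by `model` before summarizing.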
| Key | Action |
|---|---|
| Shift+Tab | cycle mode |
| Shift+Enter / Ctrl+J | newline in composer |
| Enter | submit |
| Esc | interrupt running request / clear composer |
| Up | recall previous prompt (when composer is empty) |
| Down | navigate forward in prompt history |
| Up / Down | scroll transcript (when history is inactive) |
| PgUp / PgDn | jump scroll |
| End | follow latest output |
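The Up and Down history keys, together with the deduplication and stash/restore behavior mentioned in the feature list, can be modeled as a small cursor over past prompts. The sketch below is one plausible reading, not ddudu's actual logic: the `PromptHistory` class is hypothetical, dedup is assumed to skip only consecutive duplicates, and "stash/restore" is assumed to mean saving the in-progress draft when browsing starts and restoring it when the operator steps past the newest entry.

```typescript
// Hypothetical prompt history model for a composer.
class PromptHistory {
  private entries: string[] = [];
  private cursor = -1; // -1 means "not browsing"; otherwise an index into entries
  private stash = "";  // draft saved when browsing starts

  // Record a submitted prompt, skipping blanks and consecutive duplicates.
  push(prompt: string): void {
    const text = prompt.trim();
    if (text !== "" && this.entries[this.entries.length - 1] !== text) {
      this.entries.push(text);
    }
    this.cursor = -1;
    this.stash = "";
  }

  // Arrow-Up: step to an older prompt, stashing the draft on the first press.
  up(draft: string): string {
    if (this.entries.length === 0) return draft;
    if (this.cursor === -1) {
      this.stash = draft;
      this.cursor = this.entries.length - 1;
    } else if (this.cursor > 0) {
      this.cursor--;
    }
    return this.entries[this.cursor];
  }

  // Arrow-Down: step to a newer prompt, restoring the stash past the newest one.
  down(draft: string): string {
    if (this.cursor === -1) return draft;
    if (this.cursor < this.entries.length - 1) {
      this.cursor++;
      return this.entries[this.cursor];
    }
    this.cursor = -1;
    return this.stash;
  }
}
```

Each method returns the text the composer should display, so the caller only has to write the result back into the input field.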
| Command | Purpose |
|---|---|
| /clear | clear the current transcript |
| /compact | LLM-summarized context compaction with structured handoff |
| /mode | switch active mode |
| /model | switch the current mode's model |
| /plan | show the shared execution plan |
| /todo | add, update, or clear plan items |
| /permissions | change the active permission profile or configure per-tool, network, and secret trust policies |
| /memory | read, write, append, or clear scoped memory |
| /session | list sessions or resume a saved session |
| /resume | quick alias for resuming the last or a specific saved session |
| /config | show runtime config summary |
| /help | show available commands |
| /doctor | show runtime health and context info |
| /context | inspect the active prompt context snapshot |
| /queue | inspect, run, promote, drop, or clear queued prompts |
| /jobs | inspect, logs, result, retry, promote, or cancel detached background jobs |
| /artifacts | inspect recent typed artifacts |
| /review | run review checks against the current diff |
| /checkpoint | create a git checkpoint commit |
| /undo | revert the last ddudu checkpoint |
| /handoff | compact context into a new handoff session |
| /fork | fork the current session |
| /briefing | generate and save a session briefing |
| /drift | compare current repo state with the latest briefing |
| /fire | fast toggle permissionless mode |
| /init | initialize .ddudu/ files |
| /skill | inspect or load skills |
| /hook | inspect or reload file-based hooks |
| /mcp | inspect, add, trust, enable, disable, remove, or reload MCP servers |
| /team | run multi-agent orchestration |
| /quit / /exit | exit ddudu |
MIT
Inspired by minpeter 🍀
