
🦬 Terminal Stampede

A parallel agent runtime for your terminal. Up to 20 AI agents. Zero infrastructure. Built-in quality scoring. Works with any CLI coding agent.


🦬 Try the demo — no setup required!

git clone https://github.com/DUBSOpenHub/terminal-stampede.git
cd terminal-stampede && ./install.sh

Zero API calls. Just tmux and bash. See agents work in real time.


Animated demonstration of Terminal Stampede showing 8 AI agents working in parallel across separate tmux panes. Each pane displays an agent processing code, reading files, and making git commits independently while coordinated through a filesystem message queue.

Run up to 20 AI coding agents simultaneously, each in its own tmux pane with its own context window and git branch. Works with any CLI agent that can take a prompt and write code. The sweet spot is 6–8 agents (the default is 3; configurable with --count).

  • ๐Ÿ  Zero-infrastructure local swarm
  • ๐Ÿ–ฅ๏ธ tmux as execution surface
  • ๐Ÿ“‚ Filesystem as atomic message bus
  • ๐Ÿ‘€ Human-in-the-loop observability
  • ๐Ÿงฑ Simplicity over complexity โ€” no frameworks, no servers, no message brokers. The simpler the system, the more reliable the output.
  • ๐ŸŽฏ Shadow scoring โ€” quality defined before agents run, measured silently after

You've been doing AI coding one task at a time. Ask, wait, ask again, wait again. Terminal Stampede splits your terminal into multiple panes, drops an AI agent into each one, and lets them all charge through your codebase simultaneously. Each agent gets its own brain, its own branch, its own mission. You watch them work in real time through the gold ⚡ borders. Minutes later, everything's done.

Zero infrastructure. No Redis, no HTTP, no Docker, no cloud. Just files on disk and tmux.

Human in the loop, not after the fact. Every agent runs in a visible pane. Zoom in on any one, type into it, kill it, or just watch. Most multi-agent systems give you logs when it's over. This one puts you in the room while it's happening.

tmux is the runtime. Each pane is a full CLI agent session with its own context window. The filesystem is the message bus — task claiming is an atomic file rename, no locks, no coordination server. Point it at any repo.

Works with any CLI agent. Built with GitHub Copilot CLI, but the pattern is tool-agnostic — swap the agent command for Aider, Claude Code, or any CLI tool that can read a task and write code.

๐Ÿ“ Read the full story โ†’ "What If You Could Run 20 AI Agents in One Terminal?" โ€” How Havoc Hackathon, Shadow Score, Dark Factory, and Agent X-Ray led to this experiment.


🚀 Quick Start

Prerequisites

  • macOS or Linux
  • tmux (brew install tmux on macOS, or your distro's package manager)
  • A CLI coding agent (e.g., GitHub Copilot CLI, Aider, Claude Code)
  • python3, jq, openssl, git

Install

git clone https://github.com/DUBSOpenHub/terminal-stampede.git
cd terminal-stampede
chmod +x install.sh && ./install.sh

Six files land in their working locations:

| File | Location | Purpose |
| --- | --- | --- |
| Orchestrator skill | `~/.copilot/skills/stampede/SKILL.md` | Parses commands, generates tasks, monitors, synthesizes |
| Worker agent | `~/.copilot/agents/stampede-worker.agent.md` | Claims tasks, does the work, writes results |
| Merger agent | `~/.copilot/agents/stampede-merger.agent.md` | Auto-merges all branches, resolves conflicts, shadow-scores |
| Launcher | `~/bin/stampede.sh` | Creates tmux session, spawns panes, tracks PIDs |
| Monitor | `~/bin/stampede-monitor.sh` | Live progress, stuck detection, runtime stats |
| Merger script | `~/bin/stampede-merge.sh` | Discovers branches, sorts by size, launches merger |

Note: The skill and agent files install to ~/.copilot/ paths for GitHub Copilot CLI. If you use a different CLI agent (Aider, Claude Code, etc.), you only need the shell scripts in ~/bin/ — see Option A below.

Run

Option A: From the command line (works with any CLI agent)

Create task files yourself, then launch:

# 1. Create a run directory (inside your repo)
cd ~/my-project
RUN_ID="run-$(date +%Y%m%d-%H%M%S)"
mkdir -p .stampede/$RUN_ID/{queue,claimed,results,logs}

# 2. Add task files (one JSON per task)
cat > .stampede/$RUN_ID/queue/task-001.json << 'EOF'
{
  "task_id": "task-001",
  "description": "Add input validation to the auth module",
  "scope": ["src/auth.py"],
  "branch": "stampede/task-001"
}
EOF
# ... repeat for each task

# 3. Launch the fleet
stampede.sh --run-id $RUN_ID --count 8 --repo ~/my-project --model claude-haiku-4.5

A Terminal window opens. Eight panes tile across the screen. Gold ⚡ borders show the model and task for each agent. A monitor pane tracks progress in real time. You watch them work.
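Writing each JSON by hand gets tedious past a few tasks. Step 2's task files can also be generated in a loop — a minimal sketch, with the run id hardcoded so it stands alone and placeholder descriptions you'd fill in:

```shell
RUN_ID="run-demo"                        # step 1's id, hardcoded for the demo
mkdir -p ".stampede/$RUN_ID/queue"
for i in 002 003 004; do
  cat > ".stampede/$RUN_ID/queue/task-$i.json" << EOF
{
  "task_id": "task-$i",
  "description": "TODO: describe task $i",
  "scope": [],
  "branch": "stampede/task-$i"
}
EOF
done
ls ".stampede/$RUN_ID/queue"             # lists the generated task files
```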

By default, workers launch with GitHub Copilot CLI. To use a different CLI agent, pass --agent-cmd:

# Claude Code
stampede.sh --run-id $RUN_ID --count 8 --repo ~/my-project --agent-cmd 'claude -p "{prompt}"'

# Aider
stampede.sh --run-id $RUN_ID --count 8 --repo ~/my-project --agent-cmd 'aider --message "{prompt}"'
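Under the hood, {prompt} is a plain placeholder in the command template. A rough sketch of the substitution (illustrative only — stampede.sh's actual expansion logic may differ):

```shell
AGENT_CMD='claude -p "{prompt}"'       # template as passed via --agent-cmd
PROMPT='Claim a task from the queue and complete it'
cmd=$(printf '%s' "$AGENT_CMD" | sed "s/{prompt}/$PROMPT/")
echo "$cmd"                            # the command a pane would actually run
```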

Option B: From a Copilot CLI session (if using GitHub Copilot CLI)

Open a Copilot CLI session and tell the stampede skill what to do:

stampede 8 agents on ~/my-project — add error handling, write tests, improve docs

The orchestrator reads your codebase, generates task files, launches the fleet, and monitors progress. You watch.


📊 We Pointed It at Itself

To test Terminal Stampede, we pointed it at this repo. 8 agents ran simultaneously on the terminal-stampede codebase โ€” adding error handling, creating docs, improving the agent prompts, updating the changelog, and more. Nobody touched anything. They just ran.

| Metric | Result |
| --- | --- |
| Tasks | 8 |
| Agents | 8 (claude-haiku-4.5) |
| Wall clock | ~6 minutes |
| Success rate | 8/8 |
| Coordination failures | 0 |

| Task | Changes |
| --- | --- |
| Defensive error handling for stampede.sh | +218 -33 |
| CONTRIBUTING.md (from scratch) | +219 |
| Agent hard-exit rules | +218 -33 |
| Orchestrator failure recovery docs | +132 -1 |
| CHANGELOG update from git history | +100 |
| copilot-instructions.md improvements | +85 -3 |
| Blog accuracy review | +30 -30 |
| Install.sh: uninstall, --check, versioning | +100 |
8 branches. ~800 lines of real changes. The simplest possible architecture — files on disk, atomic renames, no coordination server — was also the most reliable. Nothing broke. Nothing conflicted. The agents didn't even know each other existed.


💡 The Problem

You're a developer. Monday morning. Your codebase needs error handling added to 4 modules, test coverage expanded, docs updated, and the CLI cleaned up. That's 8 tasks.

Today, you work through them one at a time. Ask your AI agent for the first task. Wait. Ask for the second. Wait. Context-switch. Lose momentum. Some tasks take a minute, some take ten, but you're stuck in a queue of your own making.

Terminal Stampede runs them all at once. One command, up to 20 panes, each agent working in parallel on its own git branch. Instead of feeding tasks one by one, you define the batch and let them run. Your development time scales with the longest single task, not the sum of all of them.

| | Sequential | Parallel (Stampede) |
| --- | --- | --- |
| Workflow | One task at a time | All tasks at once |
| Context windows | One shared session | Up to 20 independent sessions |
| Git branches | 1 (sequential) | Up to 20 (parallel, isolated) |
| Your involvement | Babysit each task | Start it and walk away |

🤔 What Is This?

Most multi-agent frameworks (LangGraph, CrewAI, AutoGen) run agents as function calls inside one process. They share one brain: when Agent A is thinking, Agent B waits.

Terminal Stampede does something different. Each agent is a fully independent CLI session running in its own tmux pane with its own context window. It can read code, edit files, run tests, see failures, and fix them. No other agent is competing for its attention.

Each agent = one tmux pane = one independent CLI session = one git branch. They share nothing — no memory, no context, no files in progress. Twenty agents means twenty completely isolated AI coding sessions running side by side.

Branches are named stampede/task-001, stampede/task-002, etc. After a run, the merger combines them into stampede/merged-{run_id}. Task branches stay around for inspection until you clean up with --teardown.
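You can inspect that layout with plain git. A standalone sketch that simulates a finished run in a throwaway repo (the repo and branches here are fabricated for the demo; the names follow the convention above):

```shell
repo=$(mktemp -d) && cd "$repo"        # throwaway repo for the demo
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"
git branch stampede/task-001           # one branch per task
git branch stampede/task-002
git branch --list 'stampede/*'         # what a run leaves behind to inspect
```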

The "message queue" is just files on disk. The "orchestrator" is just a script. The "agent runtime" is just your terminal. Point it at any repo.


🗺️ How It Works

Think of a deli counter. Tasks are tickets on the wall. Agents grab one at a time.

Task claiming (race-safe)

Agent A: mv queue/task-001.json claimed/task-001.json  ← succeeds
Agent B: mv queue/task-001.json claimed/task-001.json  ← file gone, tries next

No locks. No database. Just a filesystem rename — atomic by POSIX guarantee on the same filesystem.
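A worker's claim loop can be sketched in a few lines of shell. Directory names follow the run layout described earlier; the queue is seeded with one task so the snippet runs standalone:

```shell
QUEUE=".stampede/run-demo/queue"
CLAIMED=".stampede/run-demo/claimed"
mkdir -p "$QUEUE" "$CLAIMED"
echo '{"task_id":"task-001"}' > "$QUEUE/task-001.json"   # seed one task

for task in "$QUEUE"/*.json; do
  name=$(basename "$task")
  if mv "$task" "$CLAIMED/$name" 2>/dev/null; then
    echo "claimed $name"               # this agent won the rename race
  fi                                   # else another agent got it; move on
done
```

Because mv is a single rename() call, two agents racing for the same task can never both succeed — the loser just sees the file gone.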

Each agent works alone

  1. Claim a task (atomic mv)
  2. Create git branch: stampede/task-001
  3. Read the code, make improvements, run tests
  4. Write result file (atomic: .tmp- then mv)
  5. Claim next task or exit
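Step 4's atomic write is the same trick in reverse: write under a temp name, then rename, so the orchestrator never reads a half-written result. A minimal sketch (paths assumed from the run layout above):

```shell
RESULTS=".stampede/run-demo/results"
mkdir -p "$RESULTS"
tmp="$RESULTS/.tmp-task-001.json"                      # invisible to readers
printf '{"task_id":"task-001","status":"done"}\n' > "$tmp"
mv "$tmp" "$RESULTS/task-001.json"                     # atomic publish
```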

The orchestrator watches

⚙️ [███████████████░░░░░] 75% (6/8) | alive=8 dead=0

If an agent dies mid-task, the orchestrator detects it via PID check, re-queues the task, and another agent picks it up.
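The recovery path can be sketched the same way (simplified — the real monitor tracks a PID per pane). kill -0 sends no signal; it only tests whether the process still exists:

```shell
CLAIMED=".stampede/run-demo/claimed"
QUEUE=".stampede/run-demo/queue"
mkdir -p "$CLAIMED" "$QUEUE"
echo '{"task_id":"task-002"}' > "$CLAIMED/task-002.json"

sh -c 'exit 0' & pid=$!                # stand-in for a worker that died
wait "$pid" 2>/dev/null
if ! kill -0 "$pid" 2>/dev/null; then  # PID gone: agent is dead
  mv "$CLAIMED/task-002.json" "$QUEUE/task-002.json"   # re-queue its task
fi
```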

Conflict detection

When all results are in, the orchestrator checks if any two agents modified the same file:

⚠️ CONFLICT: lib/state.py modified by task-001 and task-003
✅ No conflicts on remaining 6 branches — ready to merge

Auto-merge with shadow scoring

"Did you define what good looks like before AI ran, or after?" Most people using AI coding tools have no definition of quality โ€” they eyeball the output and hope for the best. Stampede bakes evaluation into the runtime itself. The scoring criteria are defined before agents run. Measurement happens silently during and after. The agents never know they're being scored.

After all agents finish, the merger agent combines every branch into one. It merges sequentially (smallest changes first to build a clean base), resolves conflicts using AI that reads both task descriptions to understand intent, and skips anything irreconcilable.

While merging, the merger silently shadow-scores each agent's work across 3 layers:

| Layer | When | What It Measures |
| --- | --- | --- |
| Runtime | During stampede | Time to complete, stuck events, files changed |
| Merge | During merge | Conflict friendliness (clean merge vs. conflicts caused) |
| Quality | After all merges | Completeness, scope adherence, code quality, test impact |

Scores are weighted — Completeness (30%) matters most; Conflict Friendliness (10%) matters least, since it's partly outside the agent's control.

🦬 Shadow Scorecard (weighted)
═══════════════════════════════════════════════════════════════════════════
 Agent       Model              Comp  Scope  Qual  Conflt  Test   Total  +/-
                                (30%) (25%)  (20%) (10%)   (15%)  /50
 ───────────────────────────────────────────────────────────────────────────
 task-001    claude-sonnet-4.5    10    10     8     10      5    44.2   ⚡+2
 task-002    gpt-5.1              10    10     8     10      5    44.2
 task-003    claude-sonnet-4.5    10    10     8     10      5    44.2   🐌-1
═══════════════════════════════════════════════════════════════════════════
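The totals fall out of the weights directly. Reproducing task-001's 44.2, assuming each criterion is scored out of 10, weighted, then scaled to a /50 total (the +/- adjustments are shown separately in the scorecard):

```shell
awk 'BEGIN {
  weighted = 10*0.30 + 10*0.25 + 8*0.20 + 10*0.10 + 5*0.15   # score out of 10
  printf "%.1f\n", weighted * 5                              # scale to /50
}'
```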

Scores persist across runs to ~/.stampede/model-stats.json, building a leaderboard that shows which AI models consistently produce the best work over time.
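Since the stats are just JSON on disk, they're easy to poke at with jq. A sketch — the actual schema of model-stats.json isn't documented here, so this invents a minimal stand-in file rather than reading the real one:

```shell
stats=$(mktemp)                  # stand-in for ~/.stampede/model-stats.json
cat > "$stats" << 'EOF'
{"claude-haiku-4.5": {"avg": 41.1, "branches": 22}}
EOF
jq -r 'to_entries[] | "\(.key)  avg \(.value.avg)/50"' "$stats"
```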

Which AI model is actually best for your codebase?

Every vendor publishes benchmarks. Every benchmark uses synthetic tests. None of them tell you which model writes the best code on your repo, with your patterns, in your language.

The stampede leaderboard answers that question empirically. Every run shadow-scores each model's work. Scores accumulate across runs. Over time, you get a ranking built from real work on your real codebase — not from HumanEval, not from vendor marketing, not from someone else's synthetic tests. From your code, your tasks, your results.

📊 Model Leaderboard (12 runs)
─────────────────────────────────────────────────────
  claude-sonnet-4.5         avg 44.2/50  (18 branches)
  gpt-5.1-codex             avg 42.8/50  (14 branches)
  claude-haiku-4.5          avg 41.1/50  (22 branches)
  gpt-5.1                   avg 39.7/50  (16 branches)
  gemini-3-pro              avg 38.4/50  (10 branches)

📈 Model stats updated (12 total runs)

๐Ÿ‡ Usage

stampede.sh --run-id <id> --count <n> --repo <path> [--model <model>] [--agent-cmd <cmd>]
stampede.sh --teardown --run-id <id>

Options:
  --run-id      Run identifier (format: run-YYYYMMDD-HHMMSS)
  --count       Number of agents (1-20, sweet spot: 6-8)
  --repo        Path to any git repository
  --model       AI model (default: claude-haiku-4.5)
  --agent-cmd   Custom CLI agent command (default: GitHub Copilot CLI)
                Use {prompt} and {model} as placeholders.
  --teardown    Kill agents, clean up
  --no-attach   Don't auto-open Terminal window

🎮 Tmux Navigation

| Key | What it does |
| --- | --- |
| `tmux attach -t stampede-{run_id}` | Attach to the fleet |
| `Ctrl-B z` | Zoom one pane full screen |
| `Ctrl-B z` again | Zoom back out to the grid |
| `Ctrl-B` arrow | Move between panes |
| `Ctrl-B d` | Detach (agents keep running) |

💬 Zoom into any pane and talk to the agent mid-task. Every pane is a live session — watch, redirect, or course-correct while the stampede runs.

🖥️ Best on ultrawide. 20 agents on a 49" ultrawide gives each one the space of a normal terminal. One monitor, 20 AI brains, all visible at once.


๐Ÿ—๏ธ Architecture

┌─────────────────────────────────────────────────┐
│  Orchestrator (SKILL.md)                        │
│  Parses intent → generates tasks → launches     │
│  agents → polls results → synthesizes           │
└───────────┬─────────────────────────────────────┘
            │
            ▼
┌─────────────────────────────────────────────────┐
│  Launcher (stampede.sh)                         │
│  tmux session → N panes → PID tracking          │
└───────┬───────┬───────┬───────┬─────────────────┘
        ▼       ▼       ▼       ▼
     ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐
     │ 🦬  │ │ 🦬  │ │ 🦬  │ │ 🦬  │   Each agent: own terminal,
     │     │ │     │ │     │ │     │   own context window, own branch
     └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘
        │       │       │       │
        ▼       ▼       ▼       ▼
   ┌─────────────────────────────────┐
   │  repo/.stampede/{run_id}/       │
   │  queue/ → claimed/ → results/   │
   │                                 │
   └───────────────┬─────────────────┘
                   │ all done
                   ▼
   ┌─────────────────────────────────┐
   │  Merger (stampede-merger)       │
   │  Auto-merge → resolve conflicts │
   │  → shadow score → leaderboard   │
   └─────────────────────────────────┘

🧠 Design Decisions

| Decision | Why |
| --- | --- |
| Filesystem as message queue | Simpler than anything else. `ls queue/` is your debugger |
| Agents for tasks, skill for orchestrator | Skills load globally, agents load per-session. Clean role isolation. Skill/agent format is Copilot CLI; shell scripts work with any CLI agent |
| Branch per task | No two agents touch main. Conflicts caught at synthesis |
| Auto-merger with AI conflict resolution | Reads both task descriptions to resolve conflicts semantically, not just syntactically |
| Weighted shadow scoring | Completeness (30%) matters most; conflict friendliness (10%) is partly luck |
| Cross-run model leaderboard | Shows which AI models consistently produce the best work over time |
| 500-word result cap | Verbose summaries would blow the orchestrator's context |
| `--max-autopilot-continues 30` | Prevents runaway agents from burning unlimited quota (Copilot CLI flag; other CLIs have their own limits) |
| Lightweight models for grunt work | Save the powerful model for synthesis, use fast ones for parallel tasks |

🦬 Origin

Built during a Havoc Hackathon, where AI models competed to design this framework across elimination rounds with sealed judging. The winning architecture was synthesized from Claude Opus 4.6 (Fast) and GPT-5.3-Codex, then battle-tested with live stampedes on real codebases.

Read the full story: I Split One Terminal Into 20 AI Brains. Here's What Happened. →

📄 License

MIT — use it, fork it, stampede with it. 🦬


๐Ÿ™ Built with Love

Created with 💜 by DUBSOpenHub. Works with any CLI coding agent.

Let's build! 🚀✨
