Note
Blacksmith is under active development and the API surface is not yet stable. We're iterating quickly — expect breaking changes in early releases. That said, it's fully functional today and we'd love for you to try it out.
A supervised agent harness that runs AI coding agents in a loop — dispatching prompts, monitoring sessions, enforcing health invariants, collecting metrics, and repeating.
Blacksmith currently depends on bd (beads) to record the tasks you want accomplished.
beads - https://github.com/steveyegge/beads
blacksmith init currently requires claude to be installed and set up.
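Before installing, it can help to confirm that both prerequisites are on your PATH. The sketch below is not part of blacksmith: missing_tools is a hypothetical helper name, and it assumes the executables are called bd and claude as described above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with blacksmith): print each named tool
# that cannot be found on PATH, and return non-zero if any is missing.
# The body runs in a subshell so its variables do not leak into the caller.
missing_tools() (
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  exit "$status"
)
```

Running missing_tools bd claude prints one line per missing tool and returns a non-zero status if either is absent, so it can gate a setup script.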
curl -fsSL https://raw.githubusercontent.com/ozten/blacksmith/main/scripts/install.sh | bash
To install a specific version:
curl -fsSL https://raw.githubusercontent.com/ozten/blacksmith/main/scripts/install.sh | BLACKSMITH_VERSION=0.1.0 bash
Initialize blacksmith in your project:
cd your-project
blacksmith init
This creates a .blacksmith/ directory with a default config.toml and a PROMPT.md template. Edit PROMPT.md with instructions for your agent, then start the loop:
blacksmith
Blacksmith runs claude by default for up to 25 productive iterations, monitoring for stale sessions, retrying empty outputs, and handling rate limits with exponential backoff.
See docs/getting-started.md for a full walkthrough.
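To illustrate the exponential backoff mentioned above, here is a minimal sketch of a capped, doubling retry delay. It is not blacksmith's implementation: the function name backoff_delay, the 2-second base, and the 60-second cap are assumptions chosen for illustration.

```shell
#!/bin/sh
# Illustrative only: capped exponential backoff. Prints the number of seconds
# to wait before retry number $1 (0-based). The base ($2) and cap ($3)
# defaults are assumptions, not blacksmith's actual settings.
backoff_delay() (
  attempt=$1
  base=${2:-2}
  cap=${3:-60}
  delay=$base
  i=0
  # Double the delay once per prior attempt, stopping early at the cap.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt "$cap" ]; then
    delay=$cap
  fi
  echo "$delay"
)
```

With these defaults, the delays for attempts 0 through 5 are 2, 4, 8, 16, 32, and 60 seconds. Capping the delay keeps a long rate-limit episode from producing unbounded waits.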
Edit .blacksmith/config.toml to customize behavior:
[agent]
command = "claude" # Or "codex", "aider", "opencode", etc.
[workers]
max = 3 # Concurrent workers (1 = serial mode)
[session]
max_iterations = 25
Blacksmith supports multiple AI agents. See Agent Adapters for Claude, Codex, OpenCode, Aider, and Raw adapter details.
For the full configuration reference, see docs/configuration.md.
Launch a metrics server in each project directory:
blacksmith serve &
Then launch the multi-project dashboard once:
blacksmith-ui
This opens a browser dashboard that monitors all running blacksmith projects (workers, progress, metrics, and session health) from a single view.
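If you run several projects, the per-project step can be scripted. The following is a sketch under assumptions: serve_all is a hypothetical helper, and BLACKSMITH_BIN is an override variable invented here so the helper can be exercised without blacksmith installed (blacksmith itself does not read it). It relies only on the blacksmith serve command described above.

```shell
#!/bin/sh
# Hypothetical helper: start `blacksmith serve` in the background inside each
# project directory passed as an argument. Paths that are not directories
# are skipped with a warning on stderr.
serve_all() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      # Subshell so the cd does not change the caller's working directory.
      ( cd "$dir" && "${BLACKSMITH_BIN:-blacksmith}" serve ) &
    else
      echo "skipping $dir: not a directory" >&2
    fi
  done
}
```

For example, call serve_all with the paths of your project directories, then run blacksmith-ui once to open the dashboard.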
- Supervised loop — session lifecycle, watchdog, retry, exponential backoff, graceful shutdown
- Multi-agent — parallel workers in git worktrees with conflict-aware scheduling and sequential integration
- Metrics — per-session event storage, custom extraction rules, performance briefs, targets
- Institutional memory — improvement tracking with two-speed feedback (DB → prompt promotion)
- Agent adapters — Claude, Codex, OpenCode, Aider, Raw — with graceful metric degradation
- Architecture analysis — fan-in detection, god files, circular deps, automated refactor proposals
- Deployment model — embedded defaults, blacksmith init, quality-gated bd-finish.sh
- Language-agnostic — configurable quality gates for Rust, TypeScript, Go, Python, etc.
Full documentation lives in docs/:
- Getting Started
- CLI Reference
- Configuration Reference
- Core Loop
- Metrics & Improvements
- Agent Adapters
- Multi-Agent Coordination
- Deployment Model
- Architecture Analysis
- Troubleshooting
# Build from source
cargo build --release --workspace
# Run tests
cargo test