Experimental comparison of Claude Code with Agent Teams (symmetric peers) vs default Claude Code (hub-and-spoke subagents) for bug triage and fix on the Ruff Python linter (Rust codebase).
Key finding: Ceiling effect — Claude Opus 4.6 solves all 8 bugs (8/8) in under 10 minutes regardless of treatment condition. Zero peer-to-peer communication observed across all team treatments. Agent Teams functions as a parallelism engine, not a collaboration tool, for tasks within the model's capability frontier.
First in the ate-series. Successors: ate-features (feature implementation), ate-arch (architecture design).
| Treatment | Fix Rate | Mean Time | Peer Messages |
|---|---|---|---|
| 0b (solo control) | 8/8 | 49 min | N/A |
| 2a (2-agent team) | 8/8 | 16 min | 0 |
| 5 (max parallelism) | 8/8 | 9.5 min | 0 |
See findings for full analysis and experiment-design.md for the protocol.
uv sync --group dev
uv run ate bugs list
uv run ate treatments list- Claude Code — agentic coding tool
- Agent Teams — multi-agent collaboration feature under study
- Subagents — the default hub-and-spoke delegation mechanism
make test # Unit tests (162)
make test-int # Integration tests (requires built Ruff)
make lint # Ruff linter
make typecheck # mypy strict