Agent Teams Eval (ate)

Experimental comparison of Claude Code with Agent Teams (symmetric peers) vs default Claude Code (hub-and-spoke subagents) for bug triage and fix on the Ruff Python linter (Rust codebase).

Key finding: a ceiling effect. Claude Opus 4.6 solves all 8 bugs (8/8) under every treatment condition, and in under 10 minutes at maximum parallelism. No peer-to-peer messages were observed in any team treatment. For tasks within the model's capability frontier, Agent Teams functions as a parallelism engine, not a collaboration tool.
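To make the structural difference between the two conditions concrete, the toy sketch below enumerates which communication links each topology allows. It is an illustration only: the agent names and both helper functions are hypothetical and are not part of the `ate` codebase or of Claude Code.

```python
from itertools import combinations


def hub_and_spoke_links(workers: list[str], hub: str = "lead") -> set[frozenset[str]]:
    """Links available when every worker talks only to the hub (subagent-style)."""
    return {frozenset((hub, w)) for w in workers}


def peer_links(workers: list[str]) -> set[frozenset[str]]:
    """Extra worker-to-worker links that a symmetric-peer team makes possible."""
    return {frozenset(pair) for pair in combinations(workers, 2)}


# Hypothetical agent names, chosen only for illustration.
workers = ["agent-1", "agent-2", "agent-3", "agent-4"]

print(len(hub_and_spoke_links(workers)))  # 4 hub links
print(len(peer_links(workers)))           # 6 additional peer links
```

With n workers, a peer team adds n(n-1)/2 possible links on top of the n hub links. The finding above is that none of those extra links carried any messages in this experiment.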

First in the ate-series. Successors: ate-features (feature implementation), ate-arch (architecture design).

Results at a Glance

Treatment	Fix Rate	Mean Time	Peer Messages
0b (solo control)	8/8	49 min	N/A
2a (2-agent team)	8/8	16 min	0
5 (max parallelism)	8/8	9.5 min	0
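The speedup implied by the table can be worked out directly from the reported means. The sketch below is plain arithmetic on the three values shown; the dictionary layout and function name are illustrative assumptions, not the project's analysis code, and it treats "Mean Time" as directly comparable across treatments.

```python
# Mean times in minutes, copied from the table above.
mean_minutes = {"0b": 49.0, "2a": 16.0, "5": 9.5}


def speedup_vs_control(times: dict[str, float], control: str = "0b") -> dict[str, float]:
    """Return each treatment's speedup factor relative to the control treatment."""
    base = times[control]
    return {name: base / minutes for name, minutes in times.items()}


speedups = speedup_vs_control(mean_minutes)
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
# 0b: 1.0x
# 2a: 3.1x
# 5: 5.2x
```

Since the fix rate is 8/8 in every row, mean time is the only column that separates the treatments.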

See findings for full analysis and experiment-design.md for the protocol.

Quick Start

uv sync --group dev
uv run ate bugs list
uv run ate treatments list

Built On

Claude Code — agentic coding tool
Agent Teams — multi-agent collaboration feature under study
Subagents — the default hub-and-spoke delegation mechanism

Validation Gates

make test       # Unit tests (162)
make test-int   # Integration tests (requires built Ruff)
make lint       # Ruff linter
make typecheck  # mypy strict

kar-ganap/ate
