Add checkpoint discovery hints to agent prompts by james-in-a-box[bot] · Pull Request #894 · jwbron/egg

james-in-a-box · 2026-02-23T08:29:07Z

Improve checkpoint discoverability for agents

Agents rarely use egg-checkpoint because nothing in their prompts tells them to.
The CLI works, documentation exists, and a Claude Code rule is present — but agents
only discover checkpoints if they happen to read the right docs. This change surfaces
checkpoint commands directly in agent prompts and mode documentation so every agent
session automatically knows how to query prior work.

Changes:

- Orchestrator prompts (orchestrator/routes/pipelines.py): Added checkpoint discovery hints to _build_role_context, _build_agent_prompt (tester, documenter, integrator roles), and _build_phase_scoped_prompt (revision cycles with failed session lookup).
- Agent mode commands: Added checkpoint browsing sections to tester-mode.md, documenter-mode.md, integrator-mode.md, and coder-mode.md (revision cycle).
- Rules: Updated checkpoint.md with a "When to Use" section mapping each role to its most useful checkpoint commands. Updated mission.md to list checkpoints as a context source and mention them in the gather-context workflow.
- Documentation: Expanded docs/guides/checkpoint-access.md with documenter and coder-revision workflow examples, plus cost command for integrators.
- Tests: Added test_checkpoint_discovery.py (460 lines) verifying all prompt hints are present, with a conftest.py fix for sys.modules mock isolation.

Closes #887

Issue: #887

Test plan:

Run pytest orchestrator/tests/test_checkpoint_discovery.py — all tests pass
Verify tester/documenter/integrator prompts contain egg-checkpoint commands
Verify revision-cycle prompt includes failed session lookup hint
Check agent mode .md files contain checkpoint sections

Authored-by: egg

Container ecb438f293a0195cad54e4f5fe6dc8c1371af55ce05af32dd1dfefbe82427861 exited with uncommitted changes. This commit preserves the agent's work-in-progress. Authored-by: egg

Container 302feb3d8b2a3c5204f2f8d40b381654e890ed79dd73f4a45e788f1bc5531258 exited with uncommitted changes. This commit preserves the agent's work-in-progress. Authored-by: egg

Add role-specific checkpoint discovery hints at three levels:

1. Orchestrator prompts (auto-injected into every agent session):
   - _build_role_context(): checkpoint pointer in For More Context section
   - _build_agent_prompt(): tester gets coder checkpoint list, documenter gets context --files, integrator gets context and cost commands
   - _build_phase_scoped_prompt(): revision checklist gets failed-session hint
2. Agent mode commands (supplementary reference):
   - tester-mode.md: Review Prior Work section
   - integrator-mode.md: Pipeline Overview section
   - documenter-mode.md: Find Changed Files section
   - coder-mode.md: Revision Cycle Context section
3. Mission and checkpoint rules (baseline awareness):
   - mission.md: Checkpoints row in context sources table, checkpoint hint in Gather context workflow step
   - checkpoint.md: When to Use section with role-specific guidance

All changes are additive text. No logic changes, no schema migrations. All 1331 orchestrator tests pass.
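The role-to-command mapping in this commit message can be pictured with a small sketch. It assumes a hypothetical `checkpoint_hint(role)` helper and `ROLE_HINTS` table; neither name comes from the PR, where the real insertions live inside `_build_role_context`, `_build_agent_prompt`, and `_build_phase_scoped_prompt`. The tester and documenter commands are copied from the contract tables in this thread. The integrator entries show only the subcommand names, because their flags are not quoted here.

```python
# Hypothetical sketch: the names below are illustrative, not taken from pipelines.py.
ANALYSIS_ROLES = {"architect", "task_planner", "risk_analyst"}

ROLE_HINTS = {
    "tester": [
        "egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder --phase implement",
    ],
    "documenter": [
        "egg-checkpoint context --pipeline $EGG_PIPELINE_ID --files",
    ],
    # Flags for the integrator commands are not quoted in the PR thread,
    # so only the subcommands are shown.
    "integrator": ["egg-checkpoint context", "egg-checkpoint cost"],
}


def checkpoint_hint(role: str) -> str:
    """Return role-specific hint text, or an empty string if the role gets none."""
    if role in ANALYSIS_ROLES:
        return ""
    commands = ROLE_HINTS.get(role, [])
    if not commands:
        return ""
    lines = ["Checkpoint commands for prior work:"]
    lines.extend(f"  {cmd}" for cmd in commands)
    return "\n".join(lines)
```

Keeping the commands in one table is only one possible design; its appeal is that prompts, mode files, and tests could all read the same strings instead of carrying separate copies.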

Import the real docker module in orchestrator/tests/conftest.py before test collection so that modules using sys.modules.setdefault("docker", MagicMock()) don't shadow the real package. This prevents docker_client.NotFound et al. from being bound to MagicMock objects which aren't BaseException subclasses and break except clauses.
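A minimal sketch of why the pre-import works, using a stand-in module so it runs without docker installed. The module name `fakedocker` and its `NotFound` class are hypothetical; the actual fix imports the real `docker` package in conftest.py inside a `try/except ImportError` guard.

```python
import sys
import types
from unittest.mock import MagicMock

# Stand-in for the real package ("fakedocker" is a hypothetical name).
real = types.ModuleType("fakedocker")


class NotFound(Exception):
    """Plays the role of a real exception class exported by the package."""


real.NotFound = NotFound

# Step 1: register the real module first, as the conftest.py pre-import does.
sys.modules["fakedocker"] = real

# Step 2: a test module later calls setdefault. The key already exists,
# so the MagicMock is discarded and the real module stays in place.
sys.modules.setdefault("fakedocker", MagicMock())

import fakedocker  # resolves to the real stand-in module

# NotFound is still a genuine exception class, so `except fakedocker.NotFound` is valid.
is_real_exception = issubclass(fakedocker.NotFound, BaseException)

# An attribute of a MagicMock module is a MagicMock instance, not a class,
# which is why it cannot be used in an except clause.
mock_attr_is_class = isinstance(MagicMock().NotFound, type)

# Clean up the stand-in so the sketch leaves sys.modules as it found it.
sys.modules.pop("fakedocker", None)
```

The ordering matters because `dict.setdefault` only writes when the key is absent: whichever import runs first decides what every later importer sees.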

egg-reviewer

No agent-mode design concerns. The changes are well-aligned with the agent-mode design guidelines.

The PR adds short checkpoint discovery hints (1-3 lines each) to orchestrator prompts, agent mode commands, and rules. These are orientation, not constraint — they tell agents when to use an existing tool (egg-checkpoint) without pre-fetching data, requiring structured output, or micromanaging procedure. The hints use suggestive language ("Before writing tests, review...") rather than imperative mandates, and reference the existing checkpoint.md rule for details rather than duplicating content. Analysis roles are correctly excluded.

— Authored by egg

egg-reviewer

Code Review: Add checkpoint discovery hints to agent prompts

Verdict: Approve (with minor suggestions)

Low-risk PR that adds text-only checkpoint discovery hints to orchestrator prompts, agent mode commands, rules, and documentation. No logic changes, no schema migrations. The approach is sound — addressing discoverability at the highest-leverage injection points (orchestrator prompts) rather than over-engineering with new tooling.

Issues Found

1. Command inconsistency: tester-mode.md omits --phase implement (non-blocking)

orchestrator/routes/pipelines.py:2485 uses:

egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder --phase implement

sandbox/.claude/commands/tester-mode.md:92 uses:

egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder

The --phase implement flag is omitted from tester-mode.md. The pipelines.py version is more precise — it filters to the implement phase, which is what the tester actually needs. The tester-mode.md version would return checkpoints from all phases (including plan/refine), which is rarely useful. The code reviewer for this PR noted the same inconsistency. The checkpoint.md rule (line 9) also omits --phase implement for the tester use case.

Suggestion: Add --phase implement to both tester-mode.md and the tester entry in checkpoint.md to match pipelines.py.
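One way to catch this kind of drift is a small check that every location carries the full command. The sketch below works on in-memory strings; the `find_missing` helper and the sample snippets are illustrative assumptions, not code or file contents from the repository.

```python
CANONICAL = (
    "egg-checkpoint list --pipeline $EGG_PIPELINE_ID "
    "--agent-type coder --phase implement"
)


def find_missing(sources: dict[str, str], command: str = CANONICAL) -> list[str]:
    """Return the names of sources whose text lacks the full command."""
    return sorted(name for name, text in sources.items() if command not in text)


# Sample text mirroring the state described in this review (before the fix).
sources = {
    "pipelines.py": f"hint: {CANONICAL}",
    "tester-mode.md": "egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder",
    "checkpoint.md": "egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder",
}
missing = find_missing(sources)
```

With the sample input, `missing` names the two files that omit `--phase implement`. A real check would read the files from disk rather than use inline strings.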

2. conftest.py docker import fix — correct but worth a note (non-blocking)

The conftest.py change (orchestrator/tests/conftest.py:20-28) pre-imports the real docker module before test collection to prevent sys.modules.setdefault("docker", MagicMock()) calls from shadowing it. This is the correct fix for the test_docker_client.py failures caused by mock pollution from test_checkpoint_discovery.py.

The fix is sound and well-documented with a clear comment explaining the rationale. The try/except ImportError handles environments where docker isn't installed. This is a legitimate fix to a pre-existing fragility in the test suite's module mocking strategy.

3. Test coverage is solid (positive)

460 lines of tests in test_checkpoint_discovery.py covering:

Positive cases for all execution roles (tester, documenter, integrator)
Negative cases for analysis roles (architect, task_planner, risk_analyst)
Edge cases (None prompt, None issue_number, empty feedback, cycle 0)
Revision checklist gating on review_cycle > 0 and review_feedback
End-to-end tests verifying both general and role-specific hints
Ordering assertions (checkpoint hint appears after gap-finding section)

The test file follows the same docker mock pattern as test_pipeline_prompts.py.
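The revision-checklist gating mentioned in the list above can be sketched as a few pytest-style tests. `revision_checklist` is a hypothetical stand-in written for this example; it is not the signature of `_build_phase_scoped_prompt`, and the checklist wording is invented. Only the command string and the `review_feedback` / `review_cycle > 0` condition come from this PR.

```python
FAILED_HINT = "egg-checkpoint list --issue $EGG_ISSUE_NUMBER --status failed"


def revision_checklist(review_feedback, review_cycle):
    """Hypothetical stand-in for the revision-checklist part of the prompt builder."""
    items = []
    # The hint is added only when there is feedback and this is a revision cycle.
    if review_feedback and review_cycle > 0:
        items.append("Address the review feedback")
        items.append(f"Check failed sessions: {FAILED_HINT}")
    return items


def test_hint_present_on_revision_cycle():
    assert any(FAILED_HINT in item for item in revision_checklist("fix tests", 1))


def test_hint_absent_on_cycle_zero():
    assert revision_checklist("fix tests", 0) == []


def test_hint_absent_without_feedback():
    assert revision_checklist("", 2) == []
    assert revision_checklist(None, 2) == []
```

Testing both the positive case and each way the guard can fail is what keeps first-cycle prompts free of the failed-session hint.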

No Blocking Issues

The PR is well-scoped and correctly implements the prompt-only approach (Option A). Specifically:

Orchestrator prompt changes (pipelines.py): Checkpoint hints are injected at the right locations — _build_role_context for the general pointer, _build_agent_prompt for role-specific commands, and _build_phase_scoped_prompt for revision cycle failed-session hints. Analysis roles are correctly excluded via the early return at line 1225.
Agent mode commands: Each role gets a contextually appropriate section. Coder-mode correctly scopes the hint to revision cycles only.
Rules updates: checkpoint.md gets a "When to Use" section mapping roles to commands. mission.md adds checkpoints to the context sources table and workflow.
Documentation: checkpoint-access.md adds documenter and coder-revision workflow examples. sdlc-pipeline.md lists checkpoints as a 4th context persistence mechanism.
.egg-state files: Pipeline state artifacts are properly updated for issue #887. These are expected tracking files.

The hint language is appropriately suggestive ("Before writing tests, review..." / "Find all changed files...") rather than imperative, which should minimize unnecessary checkpoint queries on first-cycle runs.

— Authored by egg

james-in-a-box

Contract Verification: PASS

All 14 tasks across 4 phases verified against the contract for issue #887.

Phase 1: Orchestrator prompt injection (5/5 tasks verified)

Task	Acceptance Criteria	Status
task-1-1	`_build_role_context()` includes `egg-checkpoint context --pipeline` for all execution roles	Verified — line appended to "For More Context" section, analysis roles correctly excluded
task-1-2	Tester prompt includes `egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder --phase implement`	Verified — exact command present in tester section
task-1-3	Documenter prompt includes `egg-checkpoint context --pipeline $EGG_PIPELINE_ID --files`	Verified — exact command present in documenter section
task-1-4	Integrator prompt includes `egg-checkpoint context` and `egg-checkpoint cost` commands	Verified — both commands present in integrator section
task-1-5	Revision checklist includes `egg-checkpoint list --issue $EGG_ISSUE_NUMBER --status failed` when review_cycle > 0	Verified — added to revision checklist, correctly absent for cycle 0 and no-feedback cases

Phase 2: Agent mode command updates (4/4 tasks verified)

Task	Acceptance Criteria	Status
task-2-1	tester-mode.md contains `## Review Prior Work` with egg-checkpoint list command	Verified
task-2-2	integrator-mode.md contains `## Pipeline Overview` with context and cost commands	Verified
task-2-3	documenter-mode.md contains `## Find Changed Files` with context --files command	Verified
task-2-4	coder-mode.md contains `## Revision Cycle Context` with --status failed command	Verified

Phase 3: Mission rule and checkpoint rule updates (3/3 tasks verified)

Task	Acceptance Criteria	Status
task-3-1	mission.md context sources table includes Checkpoints row	Verified — row added with `egg-checkpoint` CLI location
task-3-2	mission.md "Gather context" mentions egg-checkpoint for multi-agent pipelines	Verified — appended to Gather context step
task-3-3	checkpoint.md has "When to Use" section for tester, documenter, integrator, coder (revision)	Verified — all four roles listed with commands

Phase 4: Verification (2/2 tasks verified)

Task	Acceptance Criteria	Status
task-4-1	All existing tests pass	Verified — 6683 passed, 2 pre-existing failures (test_docker_client.py, unrelated)
task-4-2	Checkpoint hints appear in prompt outputs for each role	Verified — 29 new tests in test_checkpoint_discovery.py cover all roles

Additional Changes (outside contract scope but consistent)

docs/guides/checkpoint-access.md: Added documenter and coder-revision workflow examples, updated tester example to use --pipeline flag, added cost command to integrator example
docs/guides/sdlc-pipeline.md: Added checkpoints as 4th context persistence mechanism and checkpoint discovery hints bullet
sandbox/.claude/README.md: Added checkpoint.md to core rules listing
orchestrator/tests/conftest.py: Fixed docker_client test failures from sys.modules mock pollution (import-ordering fix)
orchestrator/tests/test_checkpoint_discovery.py: 460-line test file with 29 tests covering all prompt-building functions

Contract Notes

The contract's top-level acceptance_criteria array is empty — all criteria are embedded per-task. Could not use egg-contract verify-criterion due to this structure.
All task statuses in the contract remain "pending" (not updated by the coder/integrator), but the implementation is complete and verified.
No contract violations found.
No orphaned code outside contract scope.
All changes are additive text — no logic changes, no schema migrations.

Verdict: Approve

All 14 acceptance criteria are met. Implementation is clean, well-tested, and consistent across all injection points. No gaps or violations found.

— Authored by egg

Addresses review feedback: the tester checkpoint list command in tester-mode.md and checkpoint.md was missing --phase implement, which would return checkpoints from all phases instead of just the implement phase that testers actually need. Now matches the authoritative version in orchestrator/routes/pipelines.py.

james-in-a-box

Agreed. Added --phase implement to both sandbox/.claude/commands/tester-mode.md and sandbox/.claude/rules/checkpoint.md tester entries to match the authoritative version in orchestrator/routes/pipelines.py:2485. The tester only needs implement-phase checkpoints, so filtering by phase avoids returning plan/refine noise.

Pushed in 6f5a4f2.

— Authored by egg

egg-reviewer

Re-reviewed the delta since b49d797. The only change is 6f5a4f2 which adds --phase implement to tester checkpoint commands in tester-mode.md and checkpoint.md, addressing the inconsistency flagged in the previous review. No agent-mode design concerns.

— Authored by egg

james-in-a-box

Contract Verification: Re-review (commit `6f5a4f2`)

Verdict: PASS — All contract tasks verified.

Note: The branch was force-pushed since the previous review at b49d7978 (commit no longer in history). Full re-verification was performed.

Phase 1: Orchestrator prompt injection ✓

Task	Acceptance Criteria	Status
task-1-1	`_build_role_context()` includes `egg-checkpoint context --pipeline` for all execution roles	Verified — Added at L1289+, analysis roles excluded by early return
task-1-2	Tester prompt includes `egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder --phase implement`	Verified — Added in tester section of `_build_agent_prompt()`
task-1-3	Documenter prompt includes `egg-checkpoint context --pipeline $EGG_PIPELINE_ID --files`	Verified — Added in documenter section
task-1-4	Integrator prompt includes `egg-checkpoint context` and `egg-checkpoint cost` commands	Verified — Both commands added in integrator section
task-1-5	Revision checklist includes failed-session hint when `review_cycle > 0`	Verified — Added inside `if review_feedback and review_cycle > 0:` guard

Phase 2: Agent mode command updates ✓

Task	Acceptance Criteria	Status
task-2-1	`tester-mode.md` has `## Review Prior Work` with `egg-checkpoint list`	Verified
task-2-2	`integrator-mode.md` has `## Pipeline Overview` with context + cost commands	Verified
task-2-3	`documenter-mode.md` has `## Find Changed Files` with context --files	Verified
task-2-4	`coder-mode.md` has `## Revision Cycle Context` with --status failed	Verified

Phase 3: Mission rule and checkpoint rule updates ✓

Task	Acceptance Criteria	Status
task-3-1	`mission.md` context sources table includes Checkpoints row	Verified
task-3-2	`mission.md` "Gather context" step mentions `egg-checkpoint context`	Verified
task-3-3	`checkpoint.md` has "When to Use" with tester/documenter/integrator/coder guidance	Verified — All four roles listed with role-specific commands

Phase 4: Verification ✓

Task	Acceptance Criteria	Status
task-4-1	All existing tests pass	Verified — 6685 passed, 85 skipped. 2 pre-existing failures in `test_docker_client.py` (on main)
task-4-2	Checkpoint hints confirmed in prompts for all roles	Verified — 29 new tests in `test_checkpoint_discovery.py` covering all roles and edge cases

Additional Changes (non-contract)

orchestrator/tests/conftest.py: Docker import fix to prevent sys.modules mock pollution — correctly addresses a pre-existing test ordering issue
docs/guides/checkpoint-access.md: Added documenter and coder-revision workflow examples, updated tester example to use --pipeline, added cost command to integrator example
docs/guides/sdlc-pipeline.md: Added checkpoints as 4th context persistence mechanism
sandbox/.claude/README.md: Added checkpoint.md to core rules listing

All additional changes are consistent with the contract's intent and do not violate any criteria.

Contract Integrity

No regressions on previously verified criteria (force-push, so full re-verification done)
No orphaned code outside contract scope
No contract violations found
All 14 task acceptance criteria met

— Authored by egg

egg-reviewer

Re-Review: Add checkpoint discovery hints to agent prompts

Delta reviewed: b49d79788d..6f5a4f2 (feedback-addressing commit)

Previous Feedback: Addressed

The --phase implement inconsistency has been fixed correctly. The tester checkpoint command now includes --phase implement in all three locations:

sandbox/.claude/commands/tester-mode.md — fixed
sandbox/.claude/rules/checkpoint.md — fixed
docs/guides/checkpoint-access.md — already correct

All five injection points (pipelines.py, tester-mode.md, checkpoint.md, checkpoint-access.md, and the test assertions) now use the same command: egg-checkpoint list --pipeline $EGG_PIPELINE_ID --agent-type coder --phase implement.

Full Review Summary

Reviewed all 21 changed files across the PR. No blocking issues found.

Orchestrator prompt changes (pipelines.py): Five clean insertions across _build_role_context, _build_agent_prompt (tester/documenter/integrator), and _build_phase_scoped_prompt. Analysis roles are correctly excluded via the early return at line 1225. Checkpoint hints are placed at the right location in each section — after the relevant task context, not before. The revision checklist hint is properly gated behind review_cycle > 0 and review_feedback.

Agent mode commands (.md files): Each role gets a contextually appropriate section placed before the Quality Checklist. Commands match the authoritative versions in pipelines.py.

Rules and docs: checkpoint.md "When to Use" section, mission.md context sources table, and checkpoint-access.md workflow examples are all consistent with each other and with the orchestrator prompts.

Tests (test_checkpoint_discovery.py): 29 tests covering positive cases for all execution roles, negative cases for all analysis roles, edge cases (None prompt, None issue_number, empty feedback, cycle 0), ordering assertions, and end-to-end flows. Test structure follows the existing test_pipeline_prompts.py pattern.

conftest.py fix: Pre-importing docker before test collection prevents sys.modules.setdefault("docker", MagicMock()) from shadowing the real package. Correct fix with proper try/except ImportError guard. This addresses a real test isolation issue.

No security, correctness, robustness, or design issues identified. All changes are additive prompt text — no logic modifications, no schema changes, no breaking changes.

— Authored by egg

james-in-a-box · 2026-02-23T08:49:55Z

egg is addressing review feedback...

james-in-a-box · 2026-02-23T08:51:54Z

egg feedback addressed.

Resolved conflicts in two globally-named .egg-state files that were written by different pipelines (issue #871 vs #887), the exact class of conflict this PR is designed to prevent:

- .egg-state/agent-outputs/risk_analyst-output.json: kept PR branch version (issue #871 risk assessment)
- .egg-state/checks/implement-results.json: kept PR branch version (issue #871 check results)

Both files are ephemeral pipeline state artifacts. Main's versions were from a different pipeline run (issue #887, PR #894).

egg-orchestrator added 14 commits February 23, 2026 07:16

Initialize SDLC contract for issue #887

85969d2

Add analysis for checkpoint discoverability (#887)

356f66a

Add implementation plan for checkpoint discoverability (#887)

66ca725

WIP: auto-commit uncommitted work (task_planner) [issue-887]

5f00d51

Add risk assessment for checkpoint discoverability (#887)

e205435

WIP: auto-commit uncommitted work (risk_analyst) [issue-887]

ccff265

Add plan review verdict for #887 (approved)

f07c29b

Persist statefiles after plan phase

03637fc

docs: Update documentation for checkpoint discoverability

8f9e12b

Add tests for checkpoint discovery hints in agent prompts

3c31be5

Fix lint/format in checkpoint discovery tests

161a24e

Persist statefiles after implement phase

b49d797

egg-reviewer bot approved these changes Feb 23, 2026

egg-reviewer bot reviewed Feb 23, 2026

james-in-a-box bot commented Feb 23, 2026

james-in-a-box bot commented Feb 23, 2026

egg-reviewer bot approved these changes Feb 23, 2026

james-in-a-box bot commented Feb 23, 2026

egg-reviewer bot reviewed Feb 23, 2026

jwbron approved these changes Feb 23, 2026

jwbron merged commit 3dd7f51 into main Feb 23, 2026
39 of 40 checks passed

james-in-a-box bot mentioned this pull request Feb 23, 2026

Namespace .egg-state files per-pipeline to prevent merge conflicts #895

Merged
