Skip to content

Comments

Namespace .egg-state files per-pipeline to prevent merge conflicts#895

Merged
jwbron merged 30 commits intomainfrom
egg/issue-871/work
Feb 23, 2026
Merged

Namespace .egg-state files per-pipeline to prevent merge conflicts#895
jwbron merged 30 commits intomainfrom
egg/issue-871/work

Conversation

@james-in-a-box
Copy link
Contributor

@james-in-a-box james-in-a-box bot commented Feb 23, 2026

Namespace .egg-state files per-pipeline to prevent merge conflicts

Context

When multiple pipelines run concurrently on different branches, globally-named files
in .egg-state/ cause merge conflicts. PR #869 had 4 conflicts in
agent-outputs/architect-output.json, agent-outputs/integrator-output.json,
agent-outputs/risk_analyst-output.json, and checks/implement-results.json — all
storing issue-scoped data at global paths.

Changes

Prefixes the remaining globally-named .egg-state/ files with the issue number,
matching the convention already used by contracts/, drafts/, and reviews/:

  • agent-outputs/{role}-output.jsonagent-outputs/{issue}-{role}-output.json
  • checks/implement-results.jsonchecks/{issue}-implement-results.json

Updated across all writers and readers:

  • orchestrator: dispatch.py, handoffs.py, routes/pipelines.py,
    routes/signals.py, health_checks/tier1/phase_output.py
  • shared: egg_contracts/orchestrator.py — added identifier parameter to
    load_agent_output() and save_agent_output() with backward-compatible fallback
    that checks both namespaced and legacy paths
  • sandbox agent commands: coder-mode.md, documenter-mode.md,
    integrator-mode.md, tester-mode.md
  • docs: sdlc-pipeline.md, agent-development.md, orchestrator.md
  • local pipeline mock: phase-runner.sh

Backward compatibility: readers fall back to the old global path if the namespaced
file doesn't exist, so in-flight pipelines won't break.

Tests

  • orchestrator/tests/test_handoffs_namespaced.py — 242-line test suite covering
    namespaced reads, fallback to legacy paths, and write paths
  • orchestrator/tests/test_pipeline_prompts.py — expanded to verify namespaced paths
    in prompt generation
  • shared/egg_contracts/tests/test_orchestrator.py — 235-line test suite covering
    path helpers and fallback behavior

Closes #871

Test plan:

  1. Run pytest orchestrator/tests/test_handoffs_namespaced.py — all pass
  2. Run pytest shared/egg_contracts/tests/test_orchestrator.py — all pass
  3. Run pytest orchestrator/tests/test_pipeline_prompts.py — all pass
  4. Verify no globally-named agent-output or check files remain in new pipelines

Authored-by: egg

egg-orchestrator added 28 commits February 23, 2026 07:21
Container 43985389ec6a15b861158676d5595c1076f7306aa4b6048fc26c10298f8bf331 exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Container cda12d8f2ad0f1b8de10b80c7ece574aa9dc9514399433b150ee16cfa50a06c9 exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Container d09dc3dd6da2090b5045cf58fa343e548d37e2f69c563d16e966a1959dbbacd4 exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Container 69ae9374a57d14f6288cea0f022be7439821623fd492b99e463f60eabc7dcaee exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Container f140cdb5bace1d4c87c8ed24976c41ec649692560a308ddc3ae4fe79f067fbb8 exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Container 90b2c25febc1000b81a154fdc49dca28e4d232862cb4e3f1642bbf208f62f694 exited with uncommitted changes.
This commit preserves the agent's work-in-progress.

Authored-by: egg
Add issue/pipeline identifier prefix to all globally-named files in
.egg-state/agent-outputs/ and .egg-state/checks/, matching the convention
already used by contracts/, drafts/, and reviews/.

Core changes:
- shared/egg_contracts/orchestrator.py: Add optional identifier param to
  load_agent_output, save_agent_output, collect_handoff_data with fallback
- orchestrator/handoffs.py: Forward identifier through wrapper functions
- orchestrator/dispatch.py: Pass self.contract_key as identifier
- orchestrator/routes/signals.py: Derive identifier from pipeline
- orchestrator/routes/pipelines.py: Update prompt builders and readers
  to use prefixed filenames
- test_pipeline_prompts.py: Update checker/check-and-fix result filename
  assertions to expect prefixed names; add TestNamespacedOutputFilenames
  class for agent prompt output paths; add _read_tester_gaps identifier
  tests for prefixed/fallback behavior
- test_orchestrator.py: New test file with 9 tests for load_agent_output,
  save_agent_output, and collect_handoff_data with identifier parameter
  (prefixed path, fallback, neither-exists, priority, string identifiers)
- phase-runner.sh: Derive identifier from EGG_ISSUE_NUMBER/EGG_PIPELINE_ID
  for namespaced implement-results.json filename
Update all documentation and sandbox agent mode command files to reference
{identifier}-prefixed output filenames instead of global names. Includes
sdlc-pipeline.md, agent-development.md, orchestrator.md, all four agent
mode command files, and the phase_output.py docstring.
@james-in-a-box

This comment has been minimized.

Resolved conflicts in two globally-named .egg-state files that were
written by different pipelines (issue #871 vs #887) — the exact class
of conflict this PR is designed to prevent:

- .egg-state/agent-outputs/risk_analyst-output.json: kept PR branch
  version (issue #871 risk assessment)
- .egg-state/checks/implement-results.json: kept PR branch version
  (issue #871 check results)

Both files are ephemeral pipeline state artifacts. Main's versions
were from a different pipeline run (issue #887, PR #894).
@james-in-a-box
Copy link
Contributor Author

Conflict Resolution Summary

Resolved merge conflicts with main (commit 3dd7f51):

File Category Resolution
.egg-state/agent-outputs/risk_analyst-output.json Additive (different issues) Kept PR branch version (issue #871 risk assessment). Main's version was from issue #887/PR #894.
.egg-state/checks/implement-results.json Additive (different issues) Kept PR branch version (issue #871 check results). Main's version was from issue #887/PR #894.

Context: Both conflicts were in globally-named .egg-state/ files where different pipelines (issue #871 vs #887) wrote different data to the same path — exactly the class of conflict this PR is designed to prevent. The files are ephemeral pipeline state artifacts; keeping the PR branch versions preserves the issue #871 pipeline state.

Auto-merged without conflicts: docs/guides/sdlc-pipeline.md, orchestrator/routes/pipelines.py, sandbox/.claude/commands/{coder,documenter,integrator,tester}-mode.md

Verification:

  • make lint — all checks passed (ruff, mypy, shellcheck, custom checks)
  • make test — 6722 passed, 85 skipped. 2 pre-existing failures in test_docker_client.py (unrelated to this merge; these tests fail on main as well due to missing docker module in sandbox)
  • PR-specific tests: all 209 tests pass (test_handoffs_namespaced.py, test_pipeline_prompts.py, test_orchestrator.py)

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No agent-mode design concerns. The structured JSON output files in agent-outputs/ are legitimate machine-to-machine handoffs between agents and the orchestrator — the correct use case for structured output per the guidelines. The shell variable interpolation in sandbox agent mode commands (${EGG_ISSUE_NUMBER:-$EGG_PIPELINE_ID}) is a reasonable way to communicate the namespaced path convention to agents without over-constraining them.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contract Verification: APPROVED

Verified all 20 tasks across 5 contract phases for issue #871. Every task's acceptance criteria is satisfied.


Phase 1: Core path construction (shared/egg_contracts/orchestrator.py)

Task Criterion Verdict
task-1-1 load_agent_output identifier param load_agent_output(repo, role, identifier=871) reads from 871-{role}-output.json, falling back to {role}-output.json. Without identifier, reads global path. VERIFIED — Optional `identifier: int
task-1-2 save_agent_output identifier param save_agent_output(repo, role, data, identifier=871) writes to 871-{role}-output.json. Without identifier, writes global. VERIFIED — Conditional path construction: f"{identifier}-{role.value}-output.json" when identifier provided, else f"{role.value}-output.json".
task-1-3 collect_handoff_data identifier param Passes identifier through to load_agent_output for each dependency. VERIFIED — Line 423: dep_output = load_agent_output(repo_path, dep_role, identifier=identifier).

Phase 2: Orchestrator wrapper and caller updates

Task Criterion Verdict
task-2-1 handoffs.py wrappers All four wrappers (save_agent_output, load_agent_output_data, collect_handoff_data, get_handoff_env_var) accept and forward identifier. VERIFIED — All four functions updated with `identifier: int
task-2-2 dispatch.py callers complete_agent() and get_handoff_data() pass self.contract_key as identifier. VERIFIEDsave_agent_output(..., identifier=self.contract_key) and collect_handoff_data(..., identifier=self.contract_key).
task-2-3 signals.py caller handle_complete_signal() passes identifier derived from pipeline. VERIFIED — Derives identifier = pipeline.issue_number if not None else pipeline_id, passes identifier=identifier.

Phase 3: Prompt builder and reader updates (routes/pipelines.py)

Task Criterion Verdict
task-3-1 _build_agent_prompt() prefixed filenames Prompts contain "871-architect-output.json" etc. for issue 871. VERIFIED — All three roles (architect, integrator, risk_analyst) use f"{_identifier}-{role}-output.json".
task-3-2 Checker/autofix prompts Contain "871-implement-results.json". VERIFIED_build_checker_prompt(), _build_check_and_fix_prompt(), _build_autofix_prompt() all construct f"{_id}-implement-results.json".
task-3-3 _synthesize_plan_draft_from_outputs() Reads {identifier}-architect-output.json first, falls back to global. Same for risk_analyst. VERIFIED — Constructs prefixed path, checks prefixed_file.exists(), falls back to outputs_dir / filename.
task-3-4 _read_tester_gaps() Reads {identifier}-tester-output.json first, falls back. Both callers pass identifier. VERIFIED — Function accepts identifier param with fallback. Both callers derive identifier from pipeline.issue_number or pipeline_id.
task-3-5 All remaining callers pass identifier No calls without identifier in orchestrator code. VERIFIED — Grep confirms all invocations pass identifier.

Phase 4: Tests

Task Criterion Verdict
task-4-1 test_pipeline_prompts.py updated All tests pass with prefixed filenames. VERIFIED — Assertions updated to "pid-1-implement-results.json", new TestNamespacedOutputFilenames class with 4 tests, updated TestReadTesterGaps with prefixed/fallback tests, plus new edge case and autofix classes.
task-4-2 Health check tests updated Health check tests pass with prefixed filenames. VERIFIED — These tests don't use load_agent_output/save_agent_output; they create mock files directly and the health check code uses generic keyword matching. No changes needed; tests pass as-is.
task-4-3 test_signals.py mock updated Mock correctly matches updated signature. VERIFIED — The @patch creates a MagicMock that accepts any args. The test makes no assertions on save_agent_output call args. Compatible without changes.
task-4-4 phase-runner.sh updated Writes to {identifier}-implement-results.json. VERIFIED — Derives _IDENT from $EGG_ISSUE_NUMBER or $EGG_PIPELINE_ID, uses RESULTS_FILE="$CHECKS_DIR/${_IDENT}-implement-results.json".
task-4-5 Backward-compat fallback tests All 4 fallback cases pass. VERIFIEDshared/egg_contracts/tests/test_orchestrator.py has explicit tests: (1) prefixed exists → used, (2) only global → fallback, (3) neither → empty dict, (4) both → prefixed priority. Plus additional edge case classes.

Phase 5: Documentation and agent mode commands

Task Criterion Verdict
task-5-1 sdlc-pipeline.md Reflects new naming with examples. VERIFIED — Updated to {identifier}-{role}-output.json with concrete 871-coder-output.json example.
task-5-2 agent-development.md File tree shows prefixed filenames. VERIFIED — File tree updated to {identifier}-coder-output.json etc., with backward compatibility note.
task-5-3 orchestrator.md Architecture doc reflects new naming. VERIFIED — Updated for both agent outputs and check results with examples.
task-5-4 Agent mode command files All 4 files reference prefixed paths. VERIFIED — coder-mode.md, tester-mode.md, integrator-mode.md, documenter-mode.md all use IDENT="${EGG_ISSUE_NUMBER:-$EGG_PIPELINE_ID}" pattern with ${IDENT}-{role}-output.json.
task-5-5 phase_output.py docstring Docstring mentions prefixed filenames. VERIFIED — Docstring updated to reference {identifier}-architect-output.json.

Additional Verification

  • All 6,692 tests pass with 0 failures (per integrator output)
  • Lint clean after integrator fixes (3 issues found and fixed)
  • No orphaned code — all changes map to contract tasks
  • No contract violations detected
  • Backward compatibility maintained via fallback logic in all readers

Verdict: APPROVE

All 20 contract tasks are complete and verified. The implementation correctly namespaces .egg-state files per-pipeline, preventing the merge conflicts described in issue #871. Backward compatibility is preserved through dual-path fallback in all readers.

— Authored by egg

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: Namespace .egg-state files per-pipeline

Thorough review of all changed source files (excluding .egg-state/ data artifacts and auto-generated state files).

Verdict: Approve with advisory suggestions

The implementation is correct and well-tested. The approach is straightforward: thread an identifier parameter through all read/write call sites, with backward-compatible fallback to the old global paths. No blocking issues found.


What's Good

  • Consistent pattern: The issue_number if issue_number is not None else pipeline_id idiom is applied uniformly across all call sites, matching PipelineDispatcher.contract_key.
  • Backward compatibility: All readers check the prefixed path first, then fall back to the old global path. This ensures in-flight pipelines work.
  • Comprehensive tests: Three new test suites (477+ lines) cover save/load round-trips, fallback behavior, corrupted files, edge cases (identifier=0), and prompt generation — all at both the shared/egg_contracts layer and the orchestrator/handoffs wrapper layer.
  • Sandbox command files: Correctly use ${EGG_ISSUE_NUMBER:-$EGG_PIPELINE_ID} for the shell-based identifier derivation.

Advisory Suggestions (non-blocking)

1. PR description mentions non-existent helper functions

The PR body states: "shared: egg_contracts/orchestrator.pyget_agent_output_path() and get_check_results_path() helpers with backward-compatible fallback." These functions do not exist — the actual implementation adds identifier parameters to the existing load_agent_output() and save_agent_output() functions. The description should be corrected to avoid confusion for future readers.

2. Repeated identifier derivation pattern — consider a helper

The expression issue_number if issue_number is not None else pipeline_id (or its pipeline.issue_number variant) is repeated 8 times across pipelines.py with different variable prefixes (_identifier, _ck_id, _af_id, _synth_id, _tg_id, _tg_id2). This is manageable but could be a single helper:

def _pipeline_identifier(pipeline: Pipeline, pipeline_id: str) -> int | str:
    return pipeline.issue_number if pipeline.issue_number is not None else pipeline_id

This would reduce risk of divergence if the fallback logic ever changes (e.g., to prefer pipeline.id over the pipeline_id parameter).

3. _synthesize_plan_draft uses truthiness check instead of is not None

At the file lookup in _synthesize_plan_draft:

prefixed_file = outputs_dir / f"{_synth_id}-{filename}" if _synth_id else None

This uses truthiness (if _synth_id) rather than if _synth_id is not None. While safe in practice (since _synth_id is derived from pipeline_id which is always a non-empty validated string), this is inconsistent with every other identifier check in this PR which uses is not None. Recommend changing to if _synth_id is not None for consistency.

4. _read_tester_gaps duplicates fallback logic already in load_agent_output

_read_tester_gaps reimplements its own prefixed-then-fallback file lookup instead of using load_agent_output(repo_path, AgentRole.TESTER, identifier=identifier) which already has the same logic. This is functional but means there are now two independent implementations of the same fallback pattern for tester output specifically. If the fallback behavior ever changes, this function would need a separate update. Consider delegating to the shared helper.

5. Data file inconsistency in this PR

The committed .egg-state/checks/implement-results.json is at the old non-namespaced path, while .egg-state/agent-outputs/871-integrator-output.json is at the new namespaced path. This is expected (the architect/risk_analyst ran before the code changes, while the integrator ran after), but worth noting — the old implement-results.json will remain on main as a vestige. Consider whether a cleanup of old-format files should happen in a follow-up.

— Authored by egg

@james-in-a-box

This comment has been minimized.

@james-in-a-box

This comment has been minimized.

@james-in-a-box

This comment has been minimized.

- Extract repeated `issue_number if issue_number is not None else pipeline_id`
  pattern (11 occurrences) into a `_pipeline_identifier()` helper function
  to reduce divergence risk if the fallback logic changes.

- Fix inconsistent truthiness check in `_synthesize_plan_draft`: the prefixed
  file construction used `if _synth_id` (truthiness) instead of
  `if _synth_id is not None`, which was inconsistent with all other identifier
  checks. Now that the helper always returns `int | str`, the guard is
  unnecessary and has been removed entirely.

Authored-by: egg
Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough review. Addressed suggestions #2 and #3 in 930c82c:

#1 (PR description): Agreed — fixing the description now.

#2 (Repeated identifier derivation): Agreed. Extracted all 11 occurrences (including 3 pipeline.issue_number variants the review didn't count) into a _pipeline_identifier(issue_number, pipeline_id) helper at module level.

#3 (Truthiness check): Agreed. Since _pipeline_identifier() always returns int | str, the if _synth_id guard in _synthesize_plan_draft is now unnecessary — removed the conditional entirely so the prefixed path is always constructed.

#4 (_read_tester_gaps duplication): Disagree. _read_tester_gaps intentionally differs from load_agent_output in its error handling: it logs a warning via structlog on parse failure (lines 1660-1665), whereas load_agent_output silently returns {}. This observability is valuable for debugging tester output issues in the pipeline. Delegating to load_agent_output would lose that, or require adding a logger parameter to the shared library. The file-resolution logic is only ~10 lines and the risk of divergence is low given both are tested. Keeping it as-is.

#5 (Data file inconsistency): Acknowledged — the old-format implement-results.json is expected given the timeline. Will consider cleanup in a follow-up if needed.

— Authored by egg

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No agent-mode design concerns. The delta since the last review (930c82c) is a pure refactor: extracting _pipeline_identifier() to deduplicate identifier derivation and removing the inconsistent truthiness check. No changes to agent interaction patterns, prompt design, or output consumption.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor

@egg-reviewer egg-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review: feedback addressing commit 930c82c

Reviewed the delta from 17fc1fa to 930c82c. The single commit addresses suggestions #2 and #3 from the previous review.

Previous feedback status

# Suggestion Status
1 PR description mentions non-existent helpers Author agreed, fixing separately
2 Extract _pipeline_identifier helper Fixed — all 11 occurrences replaced
3 Truthiness check in _synthesize_plan_draft Fixed — conditional removed entirely
4 _read_tester_gaps duplication Author disagreed (structlog error handling differs). Reasonable.
5 Data file inconsistency Acknowledged, follow-up if needed

Review of 930c82c

The _pipeline_identifier helper is clean:

  • Correct type signature: (int | None, str) -> int | str
  • Docstring is accurate
  • Placed at module level near the related constants — appropriate location

All 11 call sites converted correctly. Verified:

  • No remaining raw issue_number if issue_number is not None else pipeline_id patterns
  • Argument types are consistent: (issue_number, pipeline_id) for functions with separate params, (pipeline.issue_number, pipeline_id) for functions with pipeline objects
  • Local variable type annotations were properly removed (mypy infers from return type)
  • The _read_tester_gaps call sites were additionally cleaned up by inlining the helper into the identifier= kwarg, eliminating the intermediate _tg_id/_tg_id2 variables

The _synthesize_plan_draft fix is correct: since _pipeline_identifier always returns int | str, the if _synth_id guard was dead code — removing it and always constructing the prefixed path is the right change.

No new issues found. No tests were changed, which is expected — the refactor is purely internal and the existing tests exercise behavior through the public-facing functions.

— Authored by egg

@james-in-a-box

This comment has been minimized.

Copy link
Contributor Author

@james-in-a-box james-in-a-box bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contract Re-Verification: PR #895 (Issue #871)

Verdict: APPROVE — All contract tasks remain compliant after incremental changes.

Delta Since Last Review (17fc1fa5..930c82c8)

One commit (930c82c8) modifying orchestrator/routes/pipelines.py only:

  1. Extracted _pipeline_identifier() helper — Deduplicates the repeated issue_number if issue_number is not None else pipeline_id pattern (11 callsites) into a single helper function. Pure refactoring; no behavioral change.

  2. Fixed inconsistent truthiness check in _synthesize_plan_draft — The old code used if _synth_id (truthiness) which would skip prefixed path construction for falsy-but-valid identifiers (e.g., 0). Now uses the helper directly, which always returns a non-None value. This is a correctness improvement.

Contract Compliance Summary

All 5 phases verified (re-confirmed from prior review, no regressions):

Phase Tasks Status
1. Core path construction task-1-1 to task-1-3 Compliant — load_agent_output, save_agent_output, collect_handoff_data accept identifier with fallback
2. Orchestrator wrapper/caller updates task-2-1 to task-2-3 Compliant — all wrappers forward identifier; dispatch passes contract_key; signals derive identifier
3. Prompt builder/reader updates task-3-1 to task-3-5 Compliant — all prompt builders use prefixed filenames; _read_tester_gaps has fallback; no orphaned calls
4. Tests task-4-1 to task-4-5 Compliant — comprehensive test coverage including fallback, edge cases, all roles
5. Documentation/agent mode task-5-1 to task-5-5 Compliant — docs and agent mode commands reference namespaced filenames

No Regressions Found

  • No previously verified criteria were broken by the new commit
  • The helper extraction is a clean refactoring that preserves all existing behavior
  • The _synthesize_plan_draft fix is strictly an improvement (handles edge case of falsy identifiers)
  • All 11 replacement sites correctly call _pipeline_identifier(issue_number, pipeline_id) with the same parameter types as the inline pattern they replaced

Minor Advisory Note

orchestrator/routes/signals.py:195 still uses the inline pattern (pipeline.issue_number if pipeline.issue_number is not None else pipeline_id) rather than the helper. This is acceptable — the helper is defined in pipelines.py (different module) and signals.py accesses pipeline.issue_number from a pipeline object rather than a function parameter. Extracting this would require a cross-module import which is not warranted for a single callsite.

— Authored by egg

@james-in-a-box

This comment has been minimized.

@james-in-a-box
Copy link
Contributor Author

egg is addressing review feedback...

@james-in-a-box
Copy link
Contributor Author

egg feedback addressed. View run logs

9 previous review(s) hidden.

@jwbron jwbron merged commit c403108 into main Feb 23, 2026
38 of 39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Namespace .egg-state files per-pipeline to prevent merge conflicts

1 participant