Sync pipeline state branch to remote for durability #857

@james-in-a-box

The orchestrator persists pipeline state on a local-only git branch (egg/pipeline-state) backed by a Docker named volume. This branch is never pushed to a remote, which means:

  • Volume loss = total state loss — if the Docker volume is deleted or corrupted, all pipeline history is gone
  • No cross-host recovery — spinning up the orchestrator on a different host starts with a blank slate
  • Old pipelines can't be inspected externally — unlike checkpoints (which are pushed to egg/checkpoints/v2), pipeline state is invisible outside the container

Additionally, during startup reconciliation, old pipeline state files with outdated schemas (e.g., files that reference removed agent roles such as reviewer_unified) fail validation. Because there is no migration path or cleanup mechanism, the same warnings are logged on every restart, indefinitely.
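
To make the stale-schema problem concrete, here is a minimal sketch of a pre-validation pass that separates loadable state files from ones that reference roles the current schema no longer knows. It works on raw dicts rather than the real Pydantic model, and everything except the role name reviewer_unified is an assumption: the field name `agent_roles`, the `KNOWN_ROLES` set, and the file names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical set of roles the current schema accepts.
# Only "reviewer_unified" (a removed role) is taken from this issue.
KNOWN_ROLES = frozenset({"coder", "reviewer"})


def unknown_roles(state: Dict[str, Any]) -> List[str]:
    """Return the roles in a raw state dict that the current schema lacks."""
    # "agent_roles" is an assumed field name, not the real schema.
    roles = state.get("agent_roles", [])
    return sorted({role for role in roles if role not in KNOWN_ROLES})


def partition_states(
    states: Iterable[Tuple[str, Dict[str, Any]]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split (name, raw_state) pairs into loadable names and quarantined ones.

    Quarantined entries map the file name to the unknown roles found in it,
    so they can be reported once and then skipped or migrated.
    """
    loadable: List[str] = []
    quarantined: Dict[str, List[str]] = {}
    for name, state in states:
        bad = unknown_roles(state)
        if bad:
            quarantined[name] = bad
        else:
            loadable.append(name)
    return loadable, quarantined
```

Running a pass like this before validation would let reconciliation report each stale file once and then skip or migrate it, instead of re-raising the same validation error on every restart.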

Proposal:

  1. Push the egg/pipeline-state branch to remote (similar to how egg/checkpoints/v2 works)
  2. Pull on startup to restore state on fresh volumes/hosts
  3. Consider a migration or schema-compat layer for old pipeline state files that don't validate against the current Pydantic model (e.g., skip or auto-migrate unknown enum values)
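
Steps 1 and 2 could be sketched roughly as follows. This is an illustration under assumptions, not the orchestrator's actual code: the remote name `origin`, the function names, and the injectable `runner` are hypothetical, and the sketch assumes the state branch is not the one currently checked out in the working tree (git refuses to fetch into a checked-out branch).

```python
import subprocess
from typing import Callable, List, Sequence

STATE_BRANCH = "egg/pipeline-state"  # branch name from this issue
REFSPEC = f"refs/heads/{STATE_BRANCH}:refs/heads/{STATE_BRANCH}"

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run_git(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def _git(repo_dir: str, *args: str) -> List[str]:
    """Build a git argv that operates on repo_dir."""
    return ["git", "-C", repo_dir, *args]


def push_state(repo_dir: str, remote: str = "origin",
               runner: Runner = run_git) -> bool:
    """Step 1: push the local state branch to the remote."""
    return runner(_git(repo_dir, "push", remote, REFSPEC)) == 0


def restore_state(repo_dir: str, remote: str = "origin",
                  runner: Runner = run_git) -> bool:
    """Step 2: fetch the remote state branch into the local branch.

    Without a leading '+', the refspec only creates the branch or
    fast-forwards it, so diverged local state is never overwritten.
    """
    return runner(_git(repo_dir, "fetch", remote, REFSPEC)) == 0
```

On a fresh volume `restore_state` would create the local branch from the remote copy. If it returns False (for example, because the remote branch does not exist yet), the orchestrator could fall back to the current blank-slate behavior.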

— Authored by egg
