Open
Description
Context
The Entireio CLI uses 256-bucket sharded storage for checkpoint metadata: <id[:2]>/<id[2:]>/metadata.json. This distributes files across directories to avoid performance degradation from large flat directory listings in Git.
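As a rough illustration of that layout, the mapping from an ID to its path could be sketched as below. The function name `sharded_metadata_path` and the sample ID are hypothetical, and the sketch assumes IDs are lowercase hex strings, which is what yields 16 × 16 = 256 possible two-character prefixes; it is not taken from the Entireio CLI source.

```python
from pathlib import PurePosixPath


def sharded_metadata_path(checkpoint_id: str) -> PurePosixPath:
    """Map a checkpoint ID to <id[:2]>/<id[2:]>/metadata.json.

    Hypothetical helper: assumes the ID is a hex string with at least
    three characters, so both path components are non-empty.
    """
    if len(checkpoint_id) < 3:
        raise ValueError("checkpoint ID must have at least 3 characters")
    shard = checkpoint_id[:2]   # bucket directory (first two characters)
    rest = checkpoint_id[2:]    # per-checkpoint directory inside the bucket
    return PurePosixPath(shard) / rest / "metadata.json"


# Example with a made-up ID
print(sharded_metadata_path("a1b2c3d4e5f6"))  # a1/b2c3d4e5f6/metadata.json
```

`PurePosixPath` is used so the result always has forward slashes, matching how paths appear in a Git tree regardless of the host OS.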
Current State
Our checkpoint system stores data on the egg/checkpoints/v2 branch with a multi-dimensional index structure. As the number of checkpoints grows (especially with multi-agent pipelines producing many sessions per issue), the branch could accumulate significant data.
Proposal
Evaluate and potentially adopt sharded storage for checkpoint metadata:
- Shard by first 2 characters of checkpoint ID (256 buckets)
- Keeps directory sizes manageable for Git operations
- Improves `git ls-tree` and `git checkout -- path` performance
- Consider also adding checkpoint pruning/archival for old data
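To get a feel for how much sharding shrinks each directory, the following sketch counts entries per two-character bucket for a batch of synthetic IDs. The IDs here are made up (truncated SHA-256 hex digests of a counter) and the helper name `bucket_sizes` is hypothetical; real checkpoint IDs may be distributed differently, so this is only an estimate under the assumption of roughly uniform hex IDs.

```python
import hashlib
from collections import Counter


def bucket_sizes(checkpoint_ids):
    """Count how many IDs fall into each two-character shard bucket."""
    return Counter(cid[:2] for cid in checkpoint_ids)


# Synthetic stand-ins for checkpoint IDs (assumption: hex, roughly uniform).
ids = [hashlib.sha256(str(n).encode()).hexdigest()[:12] for n in range(10_000)]

sizes = bucket_sizes(ids)
print("flat directory entries:", len(ids))
print("buckets in use:", len(sizes))
print("largest bucket:", max(sizes.values()))
```

With uniformly distributed IDs, each bucket holds roughly 1/256 of the entries that a flat layout would put in a single directory, which is the effect the proposal relies on.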
This is a scalability concern — not urgent but worth addressing before the checkpoint branch becomes unwieldy.
Reference
See entireio/cli — sharded path structure in checkpoint storage.
Authored-by: egg