Feature: Raft Log Compaction (Snapshots) for Metadata Service #11

@AnishMulay

Summary

Implement periodic and on-demand snapshotting for Raft state in the MetadataService to bound log growth and speed up recovery.

Why it matters

Prevents unbounded log growth, reduces startup time, and brings the service in line with how production Raft systems manage their logs.

Scope

  • Design snapshot format for metadata state (file-to-chunk mappings, cluster term/index).
  • Implement InstallSnapshot/SaveSnapshot integration.
  • Trigger policies (size threshold, time-based, and leadership change).
  • Backward-compatible restore on restart.
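
To make the scope items concrete, here is a minimal sketch of what a snapshot record and a trigger policy could look like. Everything in it is an assumption for illustration: the `MetadataSnapshot` and `SnapshotPolicy` names, the JSON encoding, the `format_version` field, and the default thresholds are hypothetical and not taken from the MetadataService code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

SUPPORTED_FORMAT_VERSION = 1  # hypothetical; bump when the layout changes


@dataclass
class MetadataSnapshot:
    """Illustrative snapshot layout; field names are assumptions."""
    format_version: int
    last_included_index: int  # last Raft log index covered by the snapshot
    last_included_term: int   # term of that log entry
    file_to_chunks: Dict[str, List[str]] = field(default_factory=dict)

    def to_bytes(self) -> bytes:
        # sort_keys keeps the encoding deterministic, so checksums stay stable.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, raw: bytes) -> "MetadataSnapshot":
        doc = json.loads(raw.decode("utf-8"))
        # Reject unknown versions instead of guessing at their layout.
        if doc.get("format_version") != SUPPORTED_FORMAT_VERSION:
            raise ValueError(f"unsupported snapshot version: {doc.get('format_version')}")
        return cls(**doc)


@dataclass
class SnapshotPolicy:
    """Combines the three triggers named in the scope list."""
    max_log_bytes: int = 64 * 1024 * 1024  # size threshold (assumed default)
    max_interval_s: float = 300.0          # time-based trigger (assumed default)

    def should_snapshot(self, log_bytes: int, seconds_since_last: float,
                        leadership_changed: bool) -> bool:
        if leadership_changed:
            return True
        if log_bytes >= self.max_log_bytes:
            return True
        return seconds_since_last >= self.max_interval_s
```

Carrying an explicit version in the payload is one way to keep restore backward-compatible: a reader can branch on the version, or refuse a snapshot it does not understand and fall back to log replay.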

Acceptance Criteria

  • Node can restart from snapshot and catch up via incremental logs.
  • Logs are truncated only after the snapshot has been durably persisted.
  • Fuzz/chaos tests show correct recovery with mixed snapshot/log replay.
  • Benchmarks show restart time ≤30% of the no-compaction baseline at 100k ops.
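
The first two criteria can be illustrated with a small in-memory model. This is only a sketch under assumptions: `LogEntry`, `Snapshot`, `compact`, and `restore` are hypothetical names, and an entry with no chunks is treated as a delete purely for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LogEntry:
    index: int
    term: int
    path: str
    chunks: Tuple[str, ...]  # empty tuple means "delete path" in this sketch


@dataclass(frozen=True)
class Snapshot:
    last_included_index: int
    last_included_term: int
    files: Dict[str, List[str]]


def apply_entry(state: Dict[str, List[str]], entry: LogEntry) -> None:
    """Apply one committed entry to the file-to-chunk mapping."""
    if entry.chunks:
        state[entry.path] = list(entry.chunks)
    else:
        state.pop(entry.path, None)


def compact(state: Dict[str, List[str]], log: List[LogEntry],
            applied_index: int) -> Tuple[Snapshot, List[LogEntry]]:
    """Snapshot `state` (which must reflect entries up to applied_index)
    and return the log suffix that is still needed afterwards."""
    covered = [e for e in log if e.index <= applied_index]
    if not covered:
        raise ValueError("nothing to compact")
    snapshot = Snapshot(covered[-1].index, covered[-1].term,
                        {path: list(chunks) for path, chunks in state.items()})
    # A real implementation would persist the snapshot durably before
    # dropping the covered entries; this sketch only computes the suffix.
    remaining = [e for e in log if e.index > applied_index]
    return snapshot, remaining


def restore(snapshot: Snapshot, log: List[LogEntry]) -> Tuple[Dict[str, List[str]], int]:
    """Rebuild state from a snapshot, then replay only newer log entries."""
    state = {path: list(chunks) for path, chunks in snapshot.files.items()}
    last_applied = snapshot.last_included_index
    for entry in log:
        if entry.index <= last_applied:
            continue  # already reflected in the snapshot
        apply_entry(state, entry)
        last_applied = entry.index
    return state, last_applied
```

Skipping entries at or below `last_included_index` during replay means a restart still converges to the same state if the node crashed after writing the snapshot but before truncating the log.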

Notes

Consider a pluggable snapshot store (local filesystem first), atomic writes (temp file + rename), and a CRC to detect corrupted snapshots.
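
As a rough sketch of the atomic-write and CRC idea on a local filesystem: the on-disk layout below (a 4-byte big-endian CRC32 followed by the payload) and the `save_snapshot` / `load_snapshot` names are assumptions made for illustration, not an existing format.

```python
import os
import struct
import tempfile
import zlib

_HEADER = struct.Struct(">I")  # 4-byte big-endian CRC32 of the payload


def save_snapshot(path: str, payload: bytes) -> None:
    """Write payload to path so readers see the old file or the new one,
    never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    # The temp file must be in the same directory so the rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".snapshot-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(_HEADER.pack(crc) + payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # data reaches disk before the rename
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Note: fsync of the containing directory, which POSIX systems need to
    # make the rename itself durable, is omitted from this sketch.


def load_snapshot(path: str) -> bytes:
    """Read a snapshot and verify its CRC before returning the payload."""
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) < _HEADER.size:
        raise ValueError("snapshot file is truncated")
    (expected,) = _HEADER.unpack(raw[:_HEADER.size])
    payload = raw[_HEADER.size:]
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("snapshot CRC mismatch")
    return payload
```

A pluggable store could expose these two operations behind an interface, with the local filesystem as the first backend.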
