Closed
Description
Problem
RestoreFromS3() rebuilds the in-memory segment list from S3 on broker restart. If any .index file fails to download or fails to parse, it returns a hard error, which blocks the entire partition from initializing.
Orphaned .kfs files (those without a matching .index) are created when uploadFlush uploads the .kfs successfully but the .index upload then fails. Because Drain() clears the buffer before the uploads begin and onFlush is never called on failure, an orphaned .kfs holds data that was never acknowledged to the producer, so skipping it loses no committed data.
Currently, the only recovery from an orphaned .kfs is manual deletion from S3.
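To make the skip-the-orphan idea concrete, here is a minimal Go sketch of a restore loop that tolerates a missing .index. It is not the actual RestoreFromS3() code: the segment type, the errIndexMissing sentinel, the indexLoader hook, and restoreSegments itself are hypothetical names introduced for illustration. It also assumes that only a "not found" result marks an orphan, while any other download or parse failure still aborts the restore.

```go
package main

import (
	"errors"
	"strings"
)

// segment is a hypothetical, simplified stand-in for the broker's
// in-memory segment record.
type segment struct {
	key        string // S3 key of the .kfs object
	baseOffset int64
	lastOffset int64
}

// errIndexMissing is a hypothetical sentinel meaning the .index object
// does not exist in S3.
var errIndexMissing = errors.New("index object not found")

// indexLoader is a hypothetical hook that downloads and parses the .index
// belonging to a .kfs key.
type indexLoader func(kfsKey string) (segment, error)

// restoreSegments walks the listed keys and loads each .kfs segment.
// A .kfs whose .index is missing is treated as an orphan and skipped;
// any other error still aborts the restore (an assumption of this sketch).
func restoreSegments(keys []string, load indexLoader) (loaded []segment, skipped []string, err error) {
	for _, k := range keys {
		if !strings.HasSuffix(k, ".kfs") {
			continue // not a segment data object
		}
		seg, loadErr := load(k)
		switch {
		case errors.Is(loadErr, errIndexMissing):
			// Orphan: its data was never acknowledged to the producer.
			skipped = append(skipped, k)
		case loadErr != nil:
			return nil, nil, loadErr
		default:
			loaded = append(loaded, seg)
		}
	}
	return loaded, skipped, nil
}
```

Returning the skipped keys lets the caller log them or schedule cleanup instead of requiring manual deletion. Whether transient download errors should also be skipped is a separate decision; treating them as orphans could hide committed data.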
Secondary issues:
- The last offset is derived from the raw entries list rather than from the successfully loaded segments. If the index-download error is changed to a continue, this becomes a real bug, because the last entry may be an orphan.
- Read() returns ErrOffsetOutOfRange when a requested offset falls in a gap between segments. Kafka returns records from the next available segment in this case.
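Both secondary issues can be sketched in Go as follows. This is an illustration under assumptions, not the broker's real code: the segment type and the lastLoadedOffset and locate helpers are hypothetical, and the segments are assumed to be sorted by base offset and non-overlapping. The sketch also assumes that an offset below the first loaded segment should still be reported as out of range.

```go
package main

import (
	"errors"
	"sort"
)

// segment is a hypothetical, simplified segment record.
type segment struct {
	baseOffset int64
	lastOffset int64
}

// ErrOffsetOutOfRange mirrors the error named in the issue; it is
// redefined here only so the sketch is self-contained.
var ErrOffsetOutOfRange = errors.New("offset out of range")

// lastLoadedOffset derives the last offset from the segments that were
// actually loaded, not from the raw S3 listing. It returns -1 when no
// segment was loaded.
func lastLoadedOffset(segs []segment) int64 {
	if len(segs) == 0 {
		return -1
	}
	return segs[len(segs)-1].lastOffset
}

// locate returns the index of the segment that should serve a read at
// offset. If offset falls in a gap between two segments, the next
// available segment is returned and the read starts at its baseOffset.
func locate(segs []segment, offset int64) (int, error) {
	if offset < 0 || len(segs) == 0 {
		return -1, ErrOffsetOutOfRange
	}
	// Assumption: offsets before the first loaded segment are out of range.
	if offset < segs[0].baseOffset {
		return -1, ErrOffsetOutOfRange
	}
	// Find the first segment whose lastOffset is at or past the request.
	i := sort.Search(len(segs), func(i int) bool {
		return segs[i].lastOffset >= offset
	})
	if i == len(segs) {
		return -1, ErrOffsetOutOfRange // past the end of the log
	}
	return i, nil
}
```

With segments covering offsets 0 to 9 and 20 to 29, a read at offset 15 resolves to the second segment rather than failing, and lastLoadedOffset reports 29 even if a later orphaned entry exists in the listing.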
Proposal
TBD