Add graceful error handling for corrupted shard files #19
Open
Add CorruptedShardError exception and wrap all _restore_shard methods with try/except to catch C++ exceptions from corrupted/truncated shard files. Corrupted shards are now skipped with log warnings instead of crashing the process, allowing the index to remain operational with whatever valid shards remain. When all shards are corrupted, the constructor succeeds with size=0 so consumers can detect and rebuild.
Closes #12
https://claude.ai/code/session_01EygYi2hu6fQPvaR5zKCdWs

Add comprehensive tests covering all new corruption handling code paths and previously uncovered branches:
- Corrupted shard error metadata/view/load exception paths
- Search filtering with active shard overlap (_needs_compact)
- Vectors/keys array slow path with dtype conversion
- Iterator slow paths with tombstones and compaction
- Getitem error paths and empty array edge cases
https://claude.ai/code/session_01EygYi2hu6fQPvaR5zKCdWs

- Remove unused import (pathlib.Path) in test_corrupted_shards.py
- Remove unused variables (original_load, original_view)
- Fix ruff formatting in sharded.py (blank line, string concat)
https://claude.ai/code/session_01EygYi2hu6fQPvaR5zKCdWs
Summary
This PR adds comprehensive error handling for corrupted or truncated shard files in `ShardedIndex` and `ShardedNphdIndex`. Instead of crashing when it encounters a corrupted shard, the index now skips it with a warning and remains operational with the valid shards.

Key Changes
- New `CorruptedShardError` exception: A dedicated exception class for corrupted shard errors, exported from the main module so that user code can catch it if needed.
- Graceful shard restoration: Modified the `_restore_shard()` methods in `ShardedIndex`, `ShardedIndexedKeys`, `ShardedNphdIndex`, and `ShardedNphdIndex128` to return `None` instead of raising exceptions when shards are corrupted.
- Robust config resolution: Updated `_resolve_config()` and `_resolve_max_dim()` to iterate through all available shards when reading metadata, skipping corrupted ones until a valid shard is found.
- Improved `_load_existing()` logic.
- Comprehensive test coverage: Added 275 lines of tests covering the `ShardedIndex` and `ShardedNphdIndex` variants.
- Updated existing tests: Modified tests that expected exceptions on key kind mismatches to reflect the new graceful recovery behavior (index opens with size=0).

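
The skip-and-continue behavior described in the changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `read_shard`, `restore_shard`, `load_existing`, `first_readable_header`, and the `SHRD` magic-byte check are all hypothetical stand-ins, and the `CorruptedShardError` defined here only mirrors the name of the exception the PR introduces.

```python
import logging
from pathlib import Path
from typing import List, Optional

logger = logging.getLogger(__name__)


class CorruptedShardError(Exception):
    """Illustrative stand-in for the exception named in this PR."""


def read_shard(path: Path) -> bytes:
    """Hypothetical loader: a file without the b'SHRD' prefix counts as corrupted."""
    data = path.read_bytes()
    if not data.startswith(b"SHRD"):
        raise CorruptedShardError(f"bad header in {path.name}")
    return data


def restore_shard(path: Path) -> Optional[bytes]:
    """Return the shard payload, or None (after a warning) if it cannot be read."""
    try:
        return read_shard(path)
    except (CorruptedShardError, OSError) as exc:
        logger.warning("Skipping corrupted shard %s: %s", path, exc)
        return None


def load_existing(paths: List[Path]) -> List[bytes]:
    """Keep every shard that restores cleanly; an empty result means size 0."""
    shards = []
    for path in paths:
        shard = restore_shard(path)
        if shard is not None:
            shards.append(shard)
    return shards


def first_readable_header(paths: List[Path]) -> Optional[bytes]:
    """Read metadata from the first valid shard, skipping corrupted ones."""
    for path in paths:
        shard = restore_shard(path)
        if shard is not None:
            return shard[:4]
    return None
```

Returning `None` from the restore step keeps the decision about how to react (skip, count, rebuild) in the caller, which is one plausible reason to prefer it over re-raising at that level.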
Implementation Details
- [...] `ndim`/`max_dim` is provided, the index still opens successfully

https://claude.ai/code/session_01EygYi2hu6fQPvaR5zKCdWs
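
Because an index whose shards are all corrupted opens with size=0, a consumer could check the size after opening and rebuild if needed. The sketch below assumes only that the opened index exposes a `size` attribute; `FakeIndex`, `open_or_rebuild`, and the two callables are hypothetical names used for illustration and are not part of this PR.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FakeIndex:
    """Hypothetical stand-in for an opened sharded index; only `size` matters here."""

    size: int


def open_or_rebuild(
    open_index: Callable[[], FakeIndex],
    rebuild: Callable[[], FakeIndex],
) -> FakeIndex:
    """Open the index and fall back to a rebuild when it comes up empty."""
    index = open_index()
    if index.size == 0:
        # Either nothing was ever stored or every shard was skipped as corrupted.
        return rebuild()
    return index


# Simulate an index whose shards were all skipped, then rebuilt with 3 entries.
recovered = open_or_rebuild(lambda: FakeIndex(size=0), lambda: FakeIndex(size=3))
print(recovered.size)  # prints 3
```

A size of 0 cannot by itself distinguish a legitimately empty index from one whose shards were all skipped, so a real consumer would probably also check whether shard files exist on disk before rebuilding.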