Added more robust archive file corruption handling (#370)
This PR adds a check when reading archive HDF5 files: every group in the file is visited once, so that a runtime error caused by corruption in any group is caught before data are loaded. If the check fails, the archive file is removed (unless `rm_source_on_fail` is false, in which case the error is re-raised) and nothing is loaded.

Co-authored-by: Evan Goetz <evan.goetz@ligo.org>
eagoetz and Evan Goetz authored Jun 16, 2023
1 parent 7a3f4d3 commit 77f9a9a
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions gwsumm/archive.py
@@ -200,6 +200,19 @@ def read_data_archive(sourcefile, rm_source_on_fail=True):

    with File(sourcefile, 'r') as h5file:

        # Make sure that each part of the archive file is not corrupted by
        # trying to read the data. If any part is broken, delete the file and
        # return without loading anything into the gwsumm.globalv variables
        try:
            # simple lambda function here to do nothing but visit each item
            h5file.visititems(lambda name, obj: None)
        except RuntimeError as exc:
            if not rm_source_on_fail:
                raise
            warnings.warn(f"failed to read {sourcefile} [{exc}], removing...")
            os.remove(sourcefile)
            return

        # -- channels ---------------------------

        try:
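For readers who want to try the same check outside of `read_data_archive`, the following is a minimal standalone sketch of the pattern in this diff, not the gwsumm implementation. The function name `archive_is_readable` and its `opener` parameter are hypothetical and exist only for illustration; `opener` defaults to `h5py.File` and lets the check be exercised without a real corrupted file.

```python
import os
import warnings


def archive_is_readable(path, remove_on_fail=True, opener=None):
    """Visit every item in an HDF5 file once to detect corruption.

    Hypothetical helper, not part of gwsumm. Returns True if the walk
    completes, False if it failed and the file was removed.
    """
    if opener is None:
        # Deferred import so the sketch can be loaded without h5py
        import h5py
        opener = h5py.File

    try:
        with opener(path, "r") as h5file:
            # The callback returns None, so visititems keeps walking
            # until every group and dataset has been visited
            h5file.visititems(lambda name, obj: None)
    except RuntimeError as exc:
        if not remove_on_fail:
            raise
        warnings.warn(f"failed to read {path} [{exc}], removing...")
        os.remove(path)
        return False
    return True
```

A caller could then skip loading when `archive_is_readable(sourcefile)` returns False. One difference from the diff above is deliberate: here the `except` clause sits outside the `with` block, so the file handle is already closed when `os.remove` runs. Like the diff, the sketch only catches `RuntimeError`; other exception types propagate to the caller.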
