Update indexer to treat .zarr as files not dirs #1046

Merged (2 commits) on May 5, 2024

akhanf (Contributor) commented Mar 5, 2024

Microscopy file formats were recently updated in the standard to include ome-zarr, which is stored in a directory suffixed with .ome.zarr.

https://bids-specification.readthedocs.io/en/stable/modality-specific-files/microscopy.html#file-formats

However, this does not currently work with pybids, which simply indexes it as a directory. This can cause problems when an ome.zarr folder contains real data, since the large number of files in that folder slows indexing to a crawl.

This adds a quick and simple (albeit hardcoded) fix to ensure .zarr directories get indexed as files instead of directories. This enables us to get .ome.zarr microscopy files, and also prevents needless indexing of .zarr directories that contain a large number of files.
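To illustrate the idea, here is a minimal standalone sketch of a directory walk that records any directory ending in `.zarr` as a single file entry and does not descend into it. This is not the pybids implementation: `index_tree` and `DIR_AS_FILE_SUFFIXES` are hypothetical names used only for illustration, and the sketch relies solely on the standard library.

```python
import os

# Hypothetical constant: directory suffixes to index as single "files".
# Hardcoded here, mirroring the hardcoded nature of the fix described above.
DIR_AS_FILE_SUFFIXES = (".zarr",)


def index_tree(root):
    """Return (files, dirs) found under root.

    Directories whose names end in one of DIR_AS_FILE_SUFFIXES are
    reported as files, and their contents are never visited.
    """
    files, dirs = [], []
    for current, subdirs, filenames in os.walk(root):
        keep = []
        for name in subdirs:
            path = os.path.join(current, name)
            if name.endswith(DIR_AS_FILE_SUFFIXES):
                # Treat the whole store as one file entry.
                files.append(path)
            else:
                dirs.append(path)
                keep.append(name)
        # Prune in place so os.walk does not descend into .zarr stores.
        subdirs[:] = keep
        files.extend(os.path.join(current, f) for f in filenames)
    return sorted(files), sorted(dirs)
```

Because the pruning happens before the walk descends, the many chunk files inside a store are never listed, which is where the indexing time would otherwise go.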

This solves my problem, as I can now use ome.zarr in BIDS, seemingly without any negative side effects (?). A more configurable solution would of course be nicer, but I'm not sure how often this will come up in the future (i.e., BIDS "files" that are actually directories), so I'm not sure it would be worth the added config setup.

Thoughts?
Pinging @satra as someone in the BIDS community that was also involved in the ome-zarr standard.

Note: .ome.zarr does not yet seem to be in the pybids config, so I am using validate=False in the layout to test things.

Microscopy file formats were recently updated in the standard to include
ome-zarr, which are stored in a directory suffixed with '.ome.zarr'.

This adds a simple hardcoded fix to ensure .zarr directories get indexed as files
instead of directories. This enables us to get .ome.zarr microscopy
files, and also prevents needless indexing of .zarr directories that
can contain a huge number of files.

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.80%. Comparing base (6aaa5f2) to head (d556233).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1046   +/-   ##
=======================================
  Coverage   89.79%   89.80%           
=======================================
  Files          63       63           
  Lines        7175     7178    +3     
  Branches     1372     1374    +2     
=======================================
+ Hits         6443     6446    +3     
  Misses        532      532           
  Partials      200      200           


effigies (Collaborator) commented May 5, 2024

Agreed a more configurable solution would be nice. But I think that can wait until we depend on the BIDS schema, which does indicate extensions of data "files" that are actually directories.

@effigies effigies merged commit 235cd25 into bids-standard:master May 5, 2024
20 checks passed