
[Bug]: Recommended Chunk Shape doesn't take into account compound dtypes #1122

Open
2 tasks done
pauladkisson opened this issue Oct 29, 2024 · 2 comments

@pauladkisson
Member

What happened?

Dev tests are failing: https://github.com/catalystneuro/neuroconv/actions/runs/11567195585/job/32197117750

I tracked this down to an hdmf update that exposed these lines: https://github.com/hdmf-dev/hdmf/blob/dev/src/hdmf/backends/hdf5/h5tools.py#L1476-L1483

Our chunking recommendation, meanwhile, is based on the data shape here: https://github.com/catalystneuro/neuroconv/blob/main/src/neuroconv/tools/nwb_helpers/_configuration_models/_base_dataset_io.py#L263

Notice how in hdmf, if the data has a compound dtype, the shape is (len(data),), but in neuroconv the shape is always get_data_shape(data).

When the two shapes disagree, an error is thrown. This happens for the CaImAn pixel_mask, which has a compound dtype in NWB.
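To make the mismatch concrete, here is a minimal pure-Python sketch. It does not call hdmf or neuroconv: `nested_shape` is a hypothetical stand-in for a generic shape helper that descends into nested lists/tuples, `compound_shape` mirrors the `(len(data),)` rule described above, and the toy `pixel_mask` values are made up.

```python
# Hypothetical stand-in for a generic shape helper: it descends into nested
# lists/tuples, so a list of (x, y, weight) records looks two-dimensional.
def nested_shape(data):
    shape = []
    while isinstance(data, (list, tuple)):
        shape.append(len(data))
        if len(data) == 0:
            break
        data = data[0]
    return tuple(shape)


# The rule described above for compound dtypes: one record per row, so the
# dataset is one-dimensional regardless of how many fields each record has.
def compound_shape(data):
    return (len(data),)


# Toy pixel_mask: each record is (x, y, weight). Values are illustrative only.
pixel_mask = [(0, 0, 0.5), (0, 1, 0.75), (1, 1, 1.0), (2, 1, 0.25)]

recommended = nested_shape(pixel_mask)  # shape a chunk recommendation would be derived from
written = compound_shape(pixel_mask)    # shape used when the compound dataset is created

ranks_match = len(recommended) == len(written)
print(recommended, written, "ranks match:", ranks_match)
```

A chunk shape derived from the two-dimensional view cannot be applied to the one-dimensional compound dataset, which is the kind of disagreement reported here.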

My initial idea was to load the NWB schema to determine whether a dataset is compound, but I had trouble finding the right code to load the schema.

Steps to Reproduce

n/a

Traceback

No response

Operating System

Linux

Python Executable

Conda

Python Version

3.9

Package Versions

No response


@pauladkisson
Member Author

Update:
I figured out that I can use the builder's dtype (from io.get_builder(dataset)) to check for compound dtypes. However, io.get_builder only works when the nwbfile has been read from disk; it returns None when the nwbfile is in memory.
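A rough sketch of what such a check could look like, including the in-memory case where no builder is available. The helper names `dtype_from_builder` and `looks_compound` are hypothetical, and the sketch assumes a compound dtype shows up either as a list of field descriptions (as in the schema) or as an object with a non-None `names` attribute (as NumPy structured dtypes have). The builders below are simple stand-ins, not real hdmf objects.

```python
from types import SimpleNamespace


def dtype_from_builder(builder):
    # Assumption: the builder exposes a `dtype` attribute. For an in-memory
    # nwbfile there is no builder (None), so no dtype can be recovered.
    return None if builder is None else getattr(builder, "dtype", None)


def looks_compound(dtype):
    # Spec-style compound dtype: a non-empty list/tuple of field descriptions.
    if isinstance(dtype, (list, tuple)):
        return len(dtype) > 0
    # NumPy-style structured dtype: `names` is a tuple of field names,
    # while plain dtypes report None.
    return getattr(dtype, "names", None) is not None


# Stand-in for a builder obtained from a file read from disk (fields are illustrative).
on_disk_builder = SimpleNamespace(
    dtype=[
        {"name": "x", "dtype": "uint32"},
        {"name": "y", "dtype": "uint32"},
        {"name": "weight", "dtype": "float32"},
    ]
)
in_memory_builder = None  # what the in-memory case yields, per the update above

print(looks_compound(dtype_from_builder(on_disk_builder)))    # compound detected
print(looks_compound(dtype_from_builder(in_memory_builder)))  # nothing to inspect
```

The second call shows the gap: without a builder, the check silently reports "not compound", so the in-memory case still needs another source of dtype information.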

@pauladkisson
Member Author

@rly, when you have a chance could you provide some guidance on this issue?

How can I get a builder from an in-memory nwbfile? Or, if that is too difficult, how can I get access to the schema for a given neurodata object?
