Fix MultiDimImageDataset metadata handling #458

Open
wants to merge 7 commits into
base: main

benjijamorris
Contributor

What does this PR do?

  • Fixes metadata handling in multidim_image_dataset
  • Removes functionality that overlapped with bioio_loader
    • Note: the first transform after multidim_image_dataset should now be the bioio_loader, because the multidim image dataset now just returns a dict with image loading args
  • Updates the aicsimageloader name and adds metadata functionality
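To illustrate the note about transform ordering, here is a minimal sketch of the pattern: a dataset that only returns a dict of loading args, with a loader as the first transform. `ArgsOnlyDataset`, `stub_loader`, `describe`, and the keys `path`, `scene`, and `T` are hypothetical stand-ins for illustration, not the classes or argument names used in this PR.

```python
from typing import Any, Callable, Dict, List

LoadArgs = Dict[str, Any]


class ArgsOnlyDataset:
    """Hypothetical dataset: returns dicts of loading args and never opens an image."""

    def __init__(self, rows: List[LoadArgs], transforms: List[Callable[[LoadArgs], LoadArgs]]):
        self.rows = rows
        self.transforms = transforms

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int) -> LoadArgs:
        item = dict(self.rows[idx])  # copy so transforms do not mutate the stored row
        for transform in self.transforms:
            item = transform(item)
        return item


def stub_loader(item: LoadArgs) -> LoadArgs:
    """Stand-in for the loader transform: consumes the loading args, adds "image"."""
    # A real loader would open item["path"]; this stub only records what would be read.
    item["image"] = f"pixels({item['path']}, scene={item['scene']}, T={item['T']})"
    return item


def describe(item: LoadArgs) -> LoadArgs:
    """Any later transform assumes "image" already exists."""
    item["image_len"] = len(item["image"])
    return item


rows = [{"path": "/path/to/your/multidim_image.zarr", "scene": 0, "T": 3}]
sample = ArgsOnlyDataset(rows, transforms=[stub_loader, describe])[0]
print(sorted(sample))
```

In this sketch, placing `describe` before `stub_loader` raises a `KeyError`, which is the kind of failure the ordering note is meant to prevent.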

Before submitting

  • Did you make sure title is self-explanatory and the description concisely explains the PR?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you test your PR locally with pytest command?
  • Did you run pre-commit hooks with pre-commit run -a command?

Did you have fun?

Make sure you had fun coding 🙃

"""Dataset converting a `.csv` file listing multi dimensional (timelapse or multi-scene) files
and some metadata into batches of single- scene, single-timepoint, single-channel images."""
class MultiDimImageDataset(CacheDataset):
"""Dataset converting a `.csv` file or dictionary listing multi dimensional (timelapse or
Member

Can you explain how this is different from DataframeDatamodule?

Member

It would be really useful to have an example function call or config for each dataset/datamodule

Contributor Author

Agreed on the config. This standardizes the creation of a dataframe from a multi-scene/multi-timepoint image.

Member

So you're saying DataframeDatamodule does not work for multi-scene/multi-timepoint?

Contributor Author

It does if you enumerate all the scenes and timepoints for each multidim image in their own rows. Here, each row is just the multidim image path and the scenes/channels/timepoints you want to use
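The difference between the two layouts can be sketched as follows: one compact row per multidim image is expanded into one entry per scene/timepoint pair, which is the enumeration that would otherwise have to be written out by hand as separate rows. The column names (`path`, `scenes`, `timepoints`, `channels`) and the helper `expand_row` are assumptions made for illustration and are not taken from the PR.

```python
from itertools import product
from typing import Any, Dict, List


def expand_row(row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Expand one compact row into one entry per (scene, timepoint) pair."""
    return [
        {"path": row["path"], "channels": row["channels"], "scene": scene, "T": t}
        for scene, t in product(row["scenes"], row["timepoints"])
    ]


# Compact form: a single row names the image and the selections to use.
compact = {
    "path": "/path/to/your/multidim_image.zarr",
    "scenes": [0, 1],
    "timepoints": [0, 1, 2],
    "channels": [0],
}

# Enumerated form: 2 scenes x 3 timepoints gives 6 single-scene, single-timepoint entries.
expanded = expand_row(compact)
print(len(expanded))  # 6
```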

img = np.expand_dims(img, 0)
return img

def create_metatensor(self, img, meta):
Member

why are we not doing all this anymore?

Contributor Author

This part was just baking in some transforms in an easily-broken way. I think it's better to leave all of this to downstream transforms (for example, all of this is handled by the bioio image loader)

kwargs = {k: self.split_args(data[k]) for k in self.kwargs_keys if k in data}
if self.dask_load:
    img = img.get_image_dask_data(**kwargs).compute()
else:
    img = img.get_image_data(**kwargs)
img = img.astype(self.dtype)
data[self.out_key] = MetaTensor(img, meta={"filename_or_obj": path, "kwargs": kwargs})

kwargs.update({"filename_or_obj": self._get_filename(path, kwargs)})
Member

filename_or_obj is a MONAI thing, right?

Contributor Author

yes - we also use it for image saving
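As a rough illustration of why a per-item `filename_or_obj` is useful for saving, the sketch below derives a distinct name for each scene/timepoint selection so that outputs from the same multidim file do not collide. `make_filename` is a hypothetical helper, and the `scene` and `T` keys are assumed names; the PR's own `_get_filename` may work differently.

```python
from pathlib import Path
from typing import Any, Dict


def make_filename(path: str, kwargs: Dict[str, Any]) -> str:
    """Hypothetical helper: tag the file stem with the selected scene and timepoint."""
    parts = [Path(path).stem]
    for key in ("scene", "T"):
        if key in kwargs:
            parts.append(f"{key}-{kwargs[key]}")
    return "_".join(parts)


load_args = {"scene": 1, "T": 4}
# Mirrors the shape of the kwargs.update(...) call in the diff, using the hypothetical helper.
load_args.update(
    {"filename_or_obj": make_filename("/path/to/your/multidim_image.zarr", load_args)}
)
print(load_args["filename_or_obj"])  # multidim_image_scene-1_T-4
```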

ritvikvasan previously approved these changes Feb 20, 2025
mfs4rd previously approved these changes Feb 20, 2025
@benjijamorris
Contributor Author

Sorry had to re-request review after adding an example config! No code changes

ritvikvasan previously approved these changes Feb 20, 2025
num_workers: 8
dict_meta:
  path:
    - /path/to/your/multidim_image.zarr
Member

This points to a single image path? You can choose any because you're interested in metadata?
