
Lazy image #1232

Open · romainVala wants to merge 5 commits into main
Conversation

romainVala (Contributor) commented Nov 5, 2024

Fixes #328 #1224

Description
Allows lazy loading, as described previously (most recently in #1224).

The previous PR #328 was intended to do so but was never merged into main; instead, pull request #454 was added to let the user define their own format (including HDF5, for instance).

This is nice, since it avoids adding an extra dependency (like h5py). But it assumes that the custom reader will end up with a full tensor, whereas here we need to avoid reading the whole image.

The idea with this LazyImage class is not to use any transform (because most transforms would require loading the whole volume) but just to use a sampler. Once we get patches, one can then apply any transform (see the sketch after the example below).

import h5py
import numpy as np
import torch
import torchio as tio

# Create a random HDF5 image
new_shape = [1000, 1000, 1000]
file_out = h5py.File('random_int.hdf5', 'w')
hdf5_data = file_out.create_dataset('data', [1] + new_shape, dtype='i')
hdf5_affine = file_out.create_dataset('affine', [4, 4])
hdf5_affine[:] = np.eye(4)
hdf5_data[:] = np.random.randint(1, 100, [1] + new_shape)
file_out.close()  # close explicitly so the data is flushed to disk

# Read lazily: return the h5py dataset itself, not its contents
def read_hdf5(path, data_key='data', affine_key='affine'):
    file_out = h5py.File(path, 'r')
    hdf5_data = file_out[data_key]  # not read from disk yet
    hdf5_affine = file_out[affine_key][:]  # small: read fully from disk
    return hdf5_data, hdf5_affine

f_in = 'random_int.hdf5'
img = tio.LazyScalarImage(f_in, reader=read_hdf5)
suj_hdf5 = tio.Subject({'t1': img})
sampler = tio.data.GridSampler(suj_hdf5, 256, 0)

# Get patches without loading the whole volume!
# No DataLoader, so that we keep Subjects (no batch dimension)
for nump, one_patch in enumerate(sampler):
    locations = one_patch[tio.LOCATION]
    print(f'P {nump} loc {locations}')
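
To illustrate the last point, here is a minimal sketch (not part of this PR): assuming each patch yielded by the sampler is a regular in-memory Subject, any standard transform can then be applied to it; tio.RandomFlip is just an arbitrary example.

# Sketch, assuming each patch yielded by the sampler is a regular
# in-memory Subject: standard transforms can then be applied per patch
flip = tio.RandomFlip(axes=(0, 1, 2))
for one_patch in sampler:
    flipped = flip(one_patch)  # operates on the 256^3 patch only
    print(flipped.t1.data.shape)  # torch.Size([1, 256, 256, 256])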

Almost everything was already working out of the box. The important change I had to make is in the Image.load function: we do not allow multiple paths, since that would require a torch.cat.

The changes related to _parse_tensor and _parse_tensor_shape just avoid lines that expect a torch.Tensor; we may perform fewer checks here, but it is a good start. The same goes for the repr function, where the changes concern only the memory and dtype attributes.

Finally, there is an important change in the crop function to make it work without a tensor (just an indexed array); it amounts to almost nothing, since we no longer depend on SimpleITK here!
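
For reference, a rough sketch of the idea (illustrative names, not the actual diff): cropping by plain NumPy-style slicing works for any array-like, including an h5py dataset, so no torch-specific call such as .clone is needed.

import numpy as np

def crop_array_like(data, index_ini, index_fin):
    # works for torch.Tensor, np.ndarray, or h5py.Dataset alike:
    # only NumPy-style slicing is required, no .clone or SimpleITK
    i0, j0, k0 = index_ini
    i1, j1, k1 = index_fin
    return data[:, i0:i1, j0:j1, k0:k1]

patch = crop_array_like(np.zeros((1, 10, 10, 10)), (2, 2, 2), (6, 6, 6))
assert patch.shape == (1, 4, 4, 4)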

Checklist

  • I have read the CONTRIBUTING docs and have a developer setup (especially important are pre-commit and pytest)
  • Non-breaking change (would not break existing functionality)
  • Breaking change (would cause existing functionality to change)
  • Tests added or modified to cover the changes
  • Integration tests passed locally by running pytest
  • In-line docstrings updated
  • Documentation updated, tested running make html inside the docs/ folder
  • This pull request is ready to be reviewed

codecov bot commented Nov 5, 2024

Codecov Report

Attention: Patch coverage is 41.66667% with 28 lines in your changes missing coverage. Please review.

Project coverage is 84.16%. Comparing base (5ece9c7) to head (6b92212).

Files with missing lines                               Patch %   Lines
src/torchio/data/image.py                              28.94%    27 Missing ⚠️
...c/torchio/transforms/preprocessing/spatial/crop.py  75.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1232      +/-   ##
==========================================
- Coverage   84.51%   84.16%   -0.35%     
==========================================
  Files          92       92              
  Lines        5966     6013      +47     
==========================================
+ Hits         5042     5061      +19     
- Misses        924      952      +28     


fepegar (Member) commented Nov 16, 2024

Hi @romainVala. I'm not sure I understand what's going on here. Can you maybe share a usage example? Is this what @csparker247 had in mind? Do you know why the tests are failing? Can this be tested? Codecov is complaining that most of the new lines are not covered by the unit tests.

romainVala (Contributor, Author) commented Nov 18, 2024

> Hi @romainVala. I'm not sure I understand what's going on here. Can you maybe share a usage example? Is this what @csparker247 had in mind? Do you know why the tests are failing? Can this be tested? Codecov is complaining that most of the new lines are not covered by the unit tests.

Hi @fepegar
About the objective: I believe this is an important change for torchio, because it extends its usage to cases where 3D (or 2D) images are too large to be loaded into memory at once. For instance, histology data, which are evolving toward full 3D data (spatial transcriptomics, for instance), cannot use torchio today just because, if one creates a torchio Image, the whole volume is loaded into memory (or, more precisely, the Image constructor expects either a tensor or a numpy array ...).

I always thought there was no point in working in this direction, because most transforms expect the whole image to be loaded. But talking with @csparker247 made me realize two things:
1. It can still be interesting to use torchio transforms at the patch level.
2. With more work, some spatial transforms like affine and elastic deformation could be performed at the patch level (i.e., without loading the whole volume into memory) and still produce a globally equivalent transform (as if we had loaded the whole volume). Note that this will not be the case for motion simulation, since motion needs to happen in Fourier space ... (and I have no clue how to perform a Fourier transform by patch ...)

I was excited to code point 2, but now I am not sure this is the main need. The first point is already a big improvement for torchio (and it should be enough for training).

Anyway, this PR focuses only on the first point. About a usage example: what is not clear in the one I gave above? It shows exactly the use case we are interested in: how to traverse the whole volume with a GridSampler (or any other sampler) without loading the whole volume into memory.

Sorry for this PR, which is not ready at all. I did not try to add tests, since I am not yet sure this is the best way to go. In this comment, #1224 (comment), I describe alternative changes to make in torchio to achieve the same goal.

As you can see, there are only two small issues to resolve for the current code to work:

The first one is to allow Image construction with data of a type other than numpy or tensor (https://github.com/fepegar/torchio/blob/ec95cd31b5587cd3d1717b7b64f48e92a2f4f175/src/torchio/data/image.py#L490). This is a major change, because the whole logic of torchio relies on tensors. But since the current code already handles the possibility of constructing an Image without reading the data, I think it will not be a big change for torchio (see the sketch below) ...
The second change is a minor one in crop.py to avoid using .clone (which only works for tensors).
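
A minimal sketch of the duck-typing idea mentioned for the first change (the helper name is hypothetical, not code from this PR):

import numpy as np
import torch

def looks_like_image_data(data) -> bool:
    # accept anything exposing a 4D shape and NumPy-style indexing
    # (torch.Tensor, np.ndarray, h5py.Dataset, ...) instead of
    # insisting on a torch.Tensor
    shape = getattr(data, 'shape', None)
    return shape is not None and len(shape) == 4 and hasattr(data, '__getitem__')

assert looks_like_image_data(torch.zeros(1, 4, 4, 4))
assert looks_like_image_data(np.zeros((1, 4, 4, 4)))
assert not looks_like_image_data([1, 2, 3])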

The current PR is an alternative implementation, which chooses to separate the 'usual' Image loading from a 'virtual' loading by introducing a new Image class (LazyImage). But the drawback is that I need to duplicate part of the code, with the same logic as the Image class (and new tests need to be added), so I am not sure this is the best way to go.

Maybe the initial change I proposed here is a better solution.
If you agree, I can close this PR and propose a new one ... ?

Sorry for the (again) too-long message, but I hope it still helps to clarify things ...

csparker247 commented:

As you say, this isn't the full thing I mentioned, but it certainly is a logical first step as far as I can tell.

Another way to think about this is that these changes represent a minimal set of modifications one could make to the Image class (and certain library functions) which seem to make TorchIO more or less agnostic to the type of the array being wrapped by Image. I think this is generally a nice solution since it doesn't add any new imports or ever try to guess what those types could be.

It does make me wonder: why does this need to be a new class and not just changes to the Image class? Is there a strong reason why Tensor was used as the internal storage type in the first place that might be getting overlooked here? The read_and_check() and clone() calls that still remain make me wonder about the library's internal assumptions.

Maybe this class should get deployed and tested in the wild as an experimental feature for a bit, and then its functionality could get merged back into Image? Lots of options.

romainVala (Contributor, Author) commented Nov 18, 2024

@csparker247, what part of the full thing is missing here?

Do I understand correctly that the missing part is what I mentioned in point 2, i.e., having a full spatial (ITK) transform coherent over the whole volume but processed by patches?

If yes, can you explain a use case? From the data augmentation perspective, I think it is enough to do random spatial transforms at the patch level ... (you then just need to take a larger patch to avoid NaN issues with the displacement, and then crop to the wanted patch size; see the sketch below).
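
A minimal sketch of that idea, using a regular in-memory Subject and arbitrary example sizes (not part of this PR):

import torch
import torchio as tio

margin, target = 16, 64
subject = tio.Subject(t1=tio.ScalarImage(tensor=torch.rand(1, 128, 128, 128)))

sampler = tio.data.UniformSampler(target + 2 * margin)  # sample with a margin
deform = tio.RandomElasticDeformation()  # random augmentation at the patch level
crop = tio.CropOrPad(target)  # discard the deformed border

for patch in sampler(subject, num_patches=4):
    augmented = crop(deform(patch))
    assert augmented.t1.shape == (1, target, target, target)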

As mentioned in the previous discussion, I have a solution for the affine (and elastic) transforms, but it implies many more changes, and my feeling is that it would be better to start with code cleaning to regroup all ITK-related functions and try to make a unique ITK transform class.

About the changes proposed here, I do agree that the extra class (LazyImage) may not be necessary. I would prefer changing the torchio assumption that all data need to be tensors (@fepegar, we need your advice here). Doing so, the needed changes are minimal, but we need to be careful, since it may introduce some inconsistencies.

Maybe an in-between solution would be to just add an extra attribute, like "VirtualData", so that we can easily check and adapt the current logic where needed, at least to give an explicit error message if the user tries to use a "full volume" transform on lazy images (since that would blow up the CPU memory).
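
A sketch of such a check, with a hypothetical is_virtual attribute standing in for "VirtualData" (illustrative, not code from this PR):

def check_not_virtual(image) -> None:
    # hypothetical guard inside a whole-volume transform: refuse to run
    # on a lazy image instead of silently exhausting CPU memory
    if getattr(image, 'is_virtual', False):
        raise RuntimeError(
            'This transform needs the whole volume in memory '
            'and cannot be applied to a lazy (virtual) image.'
        )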

csparker247 commented:

> Do I understand correctly that the missing part is what I mentioned in point 2, i.e., having a full spatial (ITK) transform coherent over the whole volume but processed by patches?
>
> If yes, can you explain a use case? From the data augmentation perspective, I think it is enough to do random spatial transforms at the patch level ... (you then just need to take a larger patch to avoid NaN issues with the displacement, and then crop to the wanted patch size)

Yes, this is what I meant. What I eventually want is to apply a random transform to the whole volume and then sample random patches from it:

transform = tio.RandomAffine(
    scales=(0.9, 1.2),
    degrees=15,
)
transformed = transform(subject)
for patch in sampler(transformed):
    do_train(patch)

I don't see how applying at the patch level gives you the same thing without changes to the internals of the transform classes here:

for patch in sampler(subject):
    # get new patch origin and dimensions to add buffer
    origin, dims = ...

    # reload the patch from the volume with this new size
    patch_new = ...

    # apply transform
    # **random transform of patch is not the same as one transform of whole volume**
    transformed = transform(patch_new)

    # crop to expected size
    patch = crop(transformed, patch.shape)

    do_train(patch)

Anyway, like I say, I think this is a good first step.

romainVala (Contributor, Author) commented:

> I don't see how applying at the patch level gives you the same thing without changes to the internals of the transform classes here:

Yes, it does. As I said, it needs much bigger changes (in the transforms and in the patch sampler).

csparker247 commented Nov 18, 2024

Then I misunderstood what you were proposing for point 2; I thought you were saying that point 2 was already covered by this change. I see now that's not what you meant.
