[WIP] Visualize the latent space from the auto encoder #245

levje · 2024-09-27T16:45:28Z

Should probably be merged AFTER PR #220.

Description

I was asked to add a way to visualize the latent space from #220, following FINTA (Legarreta et al., 2021). As in the original paper, we project the latent space coming out of the auto-encoder into 2D using t-SNE, which tends to keep similar streamlines close together in the 2D plot and dissimilar streamlines farther apart.
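To make the projection step concrete, here is a minimal sketch of a t-SNE projection using scikit-learn. It is not the code from this PR: the helper name `project_latent_to_2d`, the latent dimension (32), the perplexity value and the random stand-in data are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_latent_to_2d(latent, perplexity=10.0, seed=0):
    """Project (n_streamlines, latent_dim) vectors to (n_streamlines, 2).

    Hypothetical helper, not part of dwi_ml.
    """
    latent = np.asarray(latent, dtype=np.float32)
    # scikit-learn requires the perplexity to be smaller than the
    # number of samples.
    if perplexity >= latent.shape[0]:
        raise ValueError("perplexity must be smaller than the number "
                         "of samples")
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    return tsne.fit_transform(latent)


# Stand-in data: two synthetic "bundles" of random latent vectors.
# Real inputs would be the encoder outputs for each streamline.
rng = np.random.default_rng(0)
bundle_a = rng.normal(loc=0.0, scale=1.0, size=(30, 32))
bundle_b = rng.normal(loc=5.0, scale=1.0, size=(30, 32))
embedding = project_latent_to_2d(np.vstack([bundle_a, bundle_b]))
```

Each row of `embedding` is the 2D position of one streamline, ready to be scattered with one colour per bundle. Note that t-SNE mostly preserves local neighbourhoods, so distances between far-apart clusters in the plot should be read with caution.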

The class latent_streamlines.py:BundlesLatentSpaceVisualizer holds the bulk of the changes. It was written so that it can be reused for other data that needs to be projected and plotted in 2D.

I was also asked to make it possible to visualize the latent space every n epochs during training; the code for that was added to ae_train_model.py. It is probably not the cleanest implementation, though. A cleaner one would need the trainer to have more hooks (e.g. a function called at the end of each epoch that plots the accumulated data and calls reset_data()), or would need the feature implemented directly in the trainer. I didn't want to touch the trainer class too much for a use case that I expect to be unique and probably not useful for other projects.

(That said, a future PR adding hooks throughout the trainer/models, in a similar fashion to LightningAI or PyTorch's nn.Module, would add more flexibility to the library in my opinion!)
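The hook idea mentioned above could look roughly like the sketch below. Everything here is hypothetical: `ToyTrainer`, `LatentCollector`, and the `batch_end_hooks` / `epoch_end_hooks` lists do not exist in the library, and only the idea of plotting accumulated data and then calling `reset_data()` at the end of each epoch comes from the description.

```python
class LatentCollector:
    """Hypothetical stand-in for the visualizer: accumulates data per epoch."""

    def __init__(self):
        self.data = []
        self.plotted = []  # (epoch, nb_points) recorded at each plot call

    def add_data(self, latent_batch):
        self.data.extend(latent_batch)

    def plot(self, epoch):
        # A real implementation would project the data and save a figure.
        self.plotted.append((epoch, len(self.data)))

    def reset_data(self):
        self.data = []


class ToyTrainer:
    """Hypothetical trainer exposing hook lists (not the dwi_ml trainer)."""

    def __init__(self, max_epochs, batches_per_epoch):
        self.max_epochs = max_epochs
        self.batches_per_epoch = batches_per_epoch
        self.batch_end_hooks = []  # called with the batch's latent vectors
        self.epoch_end_hooks = []  # called with the epoch number

    def train(self):
        for epoch in range(self.max_epochs):
            for batch_idx in range(self.batches_per_epoch):
                # Placeholder for the encoder output of one batch.
                latent_batch = [(epoch, batch_idx)]
                for hook in self.batch_end_hooks:
                    hook(latent_batch)
            for hook in self.epoch_end_hooks:
                hook(epoch)


collector = LatentCollector()
trainer = ToyTrainer(max_epochs=3, batches_per_epoch=4)
trainer.batch_end_hooks.append(collector.add_data)


def plot_and_reset(epoch):
    collector.plot(epoch)
    collector.reset_data()


trainer.epoch_end_hooks.append(plot_and_reset)
trainer.train()
```

With this shape, the training script only registers callbacks and the trainer never needs to know about visualization.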

Scripts:

ae_visualize_bundles.py : Script to test plotting the encoding of a provided list of bundles (one bundle per file), each with a different colour. A single bundle can also be provided, in which case every streamline is plotted with the same colour.
ae_train_model.py : Same script as added by [WIP][NF] Auto-encoders - streamlines - FINTA #220, with an additional part that automatically plots/saves the figures every N epochs, where N is given by the new argument --viz_latent_space_freq.
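As an illustration of how a frequency argument like `--viz_latent_space_freq` can gate the plotting, here is a small sketch. The flag name comes from this PR, but its type, its default, the 0-based epoch counting and the helper `should_visualize` are assumptions, not the actual script code.

```python
import argparse


def build_parser():
    p = argparse.ArgumentParser()
    # Type and default are assumptions made for this sketch.
    p.add_argument("--viz_latent_space_freq", type=int, default=None,
                   help="Plot the latent space every N epochs.")
    return p


def should_visualize(epoch, freq):
    """Return True when `epoch` (0-based, an assumption) ends an interval."""
    return freq is not None and freq > 0 and (epoch + 1) % freq == 0


args = build_parser().parse_args(["--viz_latent_space_freq", "10"])
epochs_plotted = [e for e in range(30)
                  if should_visualize(e, args.viz_latent_space_freq)]
```

Leaving the flag unset (here `None`) disables the visualization entirely, so the training loop behaves as before.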

[WIP] One feature that might be missing: if we have the bundle indices of the data to be added, we could assign the data to their labels directly instead of needing to call add_data_to_plot several times.
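A rough sketch of that missing feature, assuming each point comes with the integer index of its bundle. The function `group_by_bundle` and the bundle names are hypothetical and only illustrate the grouping; they are not part of BundlesLatentSpaceVisualizer.

```python
from collections import defaultdict


def group_by_bundle(points, bundle_indices, bundle_names):
    """Group points by label using one bundle index per point.

    Hypothetical helper: replaces one add-data call per bundle with a
    single pass over the data.
    """
    if len(points) != len(bundle_indices):
        raise ValueError("Need exactly one bundle index per point.")
    groups = defaultdict(list)
    for point, idx in zip(points, bundle_indices):
        groups[bundle_names[idx]].append(point)
    return dict(groups)


points = [(0.0, 0.1), (1.0, 1.1), (0.2, 0.3), (2.0, 2.1)]
bundle_indices = [0, 1, 0, 2]
bundle_names = ["bundle_A", "bundle_B", "bundle_C"]  # made-up labels
grouped = group_by_bundle(points, bundle_indices, bundle_names)
```

Each entry of `grouped` can then be plotted with its own colour in a single loop over the labels.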

Testing data and script

ae_train_model.py \
	$experiments \
	$experiment_name \
	$o_hdf5 \
	target \
	-v INFO \
	--batch_size_training 1200 \
	--batch_size_units nb_streamlines \
	--nb_subjects_per_batch 5 \
	--learning_rate 0.001 \
	--weight_decay 0.13 \
	--optimizer Adam \
	--max_epochs 1000 \
	--max_batches_per_epoch_training 20 \
	--comet_workspace <comet_workspace> \
	--comet_project dwi_ml-ae-fibercup \
	--patience 100 \
	--viz_latent_space_freq 10

Have you

Added a description of the content of this PR above
Followed proper commit message formatting
Added data and a script to test this PR
Made sure that PEP8 issues are resolved
Tested the code yourself right before pushing
Added unit tests to test the code you implemented

People this PR concerns

@arnaudbore @AntoineTheb

This reverts commit 448043b.

pep8speaks · 2024-09-27T16:45:38Z

Hello @levje, Thank you for updating !

In the file dwi_ml/models/projects/ae_models.py:

Line 154:36: E261 at least two spaces before inline comment
Line 154:80: E501 line too long (86 > 79 characters)
Line 198:1: W293 blank line contains whitespace
Line 201:1: W293 blank line contains whitespace
Line 201:1: W391 blank line at end of file

In the file dwi_ml/training/batch_loaders.py:

Line 204:80: E501 line too long (82 > 79 characters)

In the file scripts_python/ae_train_model.py:

Line 115:80: E501 line too long (83 > 79 characters)
Line 139:80: E501 line too long (83 > 79 characters)
Line 142:80: E501 line too long (83 > 79 characters)
Line 143:80: E501 line too long (85 > 79 characters)
Line 155:80: E501 line too long (91 > 79 characters)

In the file scripts_python/ae_visualize_bundles.py:

Line 32:80: E501 line too long (91 > 79 characters)
Line 78:80: E501 line too long (85 > 79 characters)
Line 85:47: E741 ambiguous variable name 'l'
Line 91:80: E501 line too long (82 > 79 characters)
Line 98:80: E501 line too long (104 > 79 characters)

In the file scripts_python/ae_visualize_streamlines.py:

Line 37:80: E501 line too long (95 > 79 characters)
Line 72:80: E501 line too long (85 > 79 characters)
Line 90:80: E501 line too long (84 > 79 characters)

Comment last updated at 2024-09-27 17:23:21 UTC

arnaudbore and others added 15 commits November 23, 2023 13:46

add ae - finta

cd9275a

Merge branch 'master' into add_autoencoder_streamlines

24bb79b

modif with em

31e2e85

add visu

346c383

merge master

5411a21

fix pep8

a341891

answer em comments from nov 2023

72789f3

fix naming class

02cab7b

fix script

7791875

fix viz

f4701be

quick fix ae_vis_streamline

c4bd181

WIP: transformer ae

448043b

Revert "WIP: transformer ae"

7bf9fe3

This reverts commit 448043b.

Latent space visualization integration

5149706

Viz latent space each n epochs

c665cf2

levje added the enhancement New feature or request label Sep 27, 2024

levje self-assigned this Sep 27, 2024

levje added 2 commits September 27, 2024 13:20

autopep8 pass

8df0c0f

Merge branch 'master' into levje/viz-latent-space

7d6de7c
