UNet Implementation details #52

Open

SebastienTs opened this issue Aug 18, 2023 · 2 comments

SebastienTs commented Aug 18, 2023

I noticed some subtle differences in the implementation of your UNet and the one in this repo: https://github.com/johschmidt42/PyTorch-2D-3D-UNet-Tutorial.

Basically, the batch normalization is performed after the ReLU layers instead of after the convolutional layers. Also, the weights are initialized with the Xavier uniform method instead of being set to a constant.
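
For illustration, the difference boils down to something like this (a schematic sketch with made-up channel sizes, not code taken from either repository):

```python
import torch.nn as nn

# Ordering used in elektronn3, as I understand it: Conv -> BN -> ReLU
block_conv_bn_relu = nn.Sequential(
    nn.Conv3d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
    nn.BatchNorm3d(64),
    nn.ReLU(inplace=True),
)

# Ordering in the linked tutorial repo: Conv -> ReLU -> BN
block_conv_relu_bn = nn.Sequential(
    nn.Conv3d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.BatchNorm3d(64),
)
```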

For my application this leads to better accuracy and speeds up training by quite some margin.
Have you tested both approaches, and do you have a strong opinion about it?

Related to this, I'm wondering if you would consider including some newer UNet variants, such as pre-trained encoders and hybrid networks with transformer blocks.

If not, is there a repository you would particularly recommend for importing PyTorch models for bioimage analysis applications (3D fluorescence LM and 3D EM)?

mdraw (Member) commented Aug 18, 2023

I chose this order of BN / ReLU layers because it was used in the other implementations I looked at (including the paper that originally proposed batch norm), and at the time I knew about optimized fused conv+bn+relu ops but had not seen conv+relu+bn ops.
But if you think about it, I guess BN after ReLU makes more sense, because ReLU after BN partly defeats the purpose of normalization: the inputs are zero-mean with unit variance, and ReLU then sets all negative elements to zero (i.e. it discards half of the input elements on average) and roughly halves the variance. So if you don't have to stay compatible with earlier neural network architectures, I would indeed suggest switching the order. I also just found a good Reddit discussion about this: https://www.reddit.com/r/MachineLearning/comments/67gonq/d_batch_normalization_before_or_after_relu/
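
As a quick sanity check of the variance claim (just a throwaway snippet, not part of elektronn3):

```python
import torch

# Simulate batch-normalized activations: zero mean, unit variance.
x = torch.randn(1_000_000)

y = torch.relu(x)  # applying ReLU directly after "BN"

print(f"fraction zeroed:      {(y == 0).float().mean():.3f}")  # ~0.5
print(f"mean after ReLU:      {y.mean():.3f}")                 # ~0.40
print(f"variance after ReLU:  {y.var():.3f}")                  # ~0.34, well below 1.0
```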

Regarding initialization, I would just do experiments with different initializations and use what works best for you. The optimal initialization can vary depending on the architecture.
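
For example, a minimal sketch of how you could swap initializations on any nn.Module-based model (the helper name here is made up, not part of elektronn3):

```python
import torch.nn as nn

def init_weights_xavier(module: nn.Module) -> None:
    """Hypothetical helper: Xavier-uniform init for conv weights, zero bias."""
    if isinstance(module, (nn.Conv2d, nn.Conv3d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# model = UNet(...)                  # whichever U-Net implementation you use
# model.apply(init_weights_xavier)   # applies the init recursively to all submodules
```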

I haven't done extensive tests of initialization and batch norm placement myself. I can only say that enabling batch norm consistently makes trainings converge much faster than disabling it, but that's not that surprising, I guess :)

I am not planning to include more recent neural network architectures, because in my experience almost none of the architectures published since the U-Net / 3D U-Net papers actually work better than a well-adapted U-Net, despite their own reported benchmark results.

If you are looking for well-written implementations of fancy new network architectures, I suggest the collection in https://github.com/Project-MONAI/MONAI/tree/dev/monai/networks/nets. You can just install monai, import the network models from the monai package, and use them in the same way as the elektronn3-included models without any further changes, because they are also just torch.nn.Module subclasses.
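
For example, roughly like this (an untested sketch; the channel/stride settings are just illustrative, so check the MONAI docs for each network's exact constructor arguments):

```python
import torch
from monai.networks.nets import UNet  # other models in monai.networks.nets work the same way

# A 3D U-Net with illustrative settings.
model = UNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128, 256),
    strides=(2, 2, 2, 2),
    num_res_units=2,
)

x = torch.randn(1, 1, 64, 64, 64)  # (batch, channels, D, H, W)
out = model(x)                     # plain torch.nn.Module, so it plugs into existing training code
print(out.shape)                   # -> torch.Size([1, 2, 64, 64, 64])
```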

SebastienTs (Author) commented Aug 19, 2023 via email
