Gradient handling in Workflows #7368
Replies: 4 comments
-
I would consider this quite high priority: the current setup of Workflows gives the user no way to do gradient normalization, and gradient clipping is forced through backward-pass hooks.
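For context, the hook-based workaround referred to here usually looks something like the sketch below. This is generic PyTorch, not MONAI code; the model and the clip value are purely illustrative:

```python
import torch

# Placeholder model purely for illustration.
model = torch.nn.Linear(10, 2)

# Clip each parameter's gradient by value as soon as it is computed
# during the backward pass.
for param in model.parameters():
    param.register_hook(lambda grad: grad.clamp(-1.0, 1.0))

loss = model(torch.randn(4, 10)).sum()
loss.backward()  # gradients arrive already clipped to [-1, 1]
```

Per-parameter value clipping can be expressed this way, but a global gradient-norm operation cannot, since it needs all gradients at once, which is the limitation raised above.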
-
Hi @danieltudosiu, could you write a handler that attaches to this event and performs the logic there? Thanks.
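A handler along these lines could look like the minimal sketch below. It assumes MONAI exposes `IterationEvents.BACKWARD_COMPLETED` and that the engine keeps a reference to its model as `engine.network`; `attach_grad_clipping` is a hypothetical helper name and the `max_norm` value is illustrative:

```python
import torch
from monai.engines.utils import IterationEvents


def attach_grad_clipping(trainer, max_norm: float = 1.0) -> None:
    """Attach a gradient-clipping handler to a MONAI trainer (illustrative helper)."""

    def _clip(engine) -> None:
        # Runs after the backward pass and before optimizer.step(),
        # clipping the global gradient norm in place.
        torch.nn.utils.clip_grad_norm_(engine.network.parameters(), max_norm)

    trainer.add_event_handler(IterationEvents.BACKWARD_COMPLETED, _clip)
```

Usage would be something like `attach_grad_clipping(trainer, max_norm=1.0)` on an already-constructed `SupervisedTrainer`.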
-
Hi @Nic-Ma, this solution would require one handler for each engine MONAI has. Given its specialized nature, wouldn't it fit better as a method on each Trainer subclass? I already submitted draft pull request #1967; it is a work in progress and currently covers only the SupervisedTrainer logic.
-
Hi @wyli, could you please share some comments here? This may be a research-specific feature request; I don't fully understand why writing a handler wouldn't work. Thanks in advance.
-
Is your feature request related to a problem? Please describe.
In some cases, gradient clipping or normalization is needed to stabilize the training of networks.
Describe the solution you'd like
Allow the option to do gradient clipping or normalization via an argument at the construction of Workflows.
Describe alternatives you've considered
Registering a hook on each model parameter to handle the gradient clipping. This is messier and not the main way PyTorch handles it. It would also rule out gradient normalization, since the PyTorch implementation is an in-place transformation over all gradients, and the non-in-place gradient clipping will be deprecated. Furthermore, with AMP we need to unscale the gradients before clipping or normalizing them, as per https://pytorch.org/docs/stable/notes/amp_examples.html#gradient-clipping.
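For reference, the AMP-aware pattern from the linked PyTorch docs looks roughly like the sketch below (a generic example assuming a CUDA device; the model, optimizer, data, and `max_norm` value are placeholders):

```python
import torch

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

inputs = torch.randn(4, 10, device="cuda")
targets = torch.randn(4, 2, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    loss = torch.nn.functional.mse_loss(model(inputs), targets)

scaler.scale(loss).backward()
# Unscale the gradients held by the optimizer before clipping, so the
# max_norm threshold applies to the true (unscaled) gradient values.
scaler.unscale_(optimizer)
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
scaler.step(optimizer)  # skips the step if inf/NaN gradients were found
scaler.update()
```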