Hi,
Thank you for your work and for sharing it.
I tried to use the convert_model function in my own code, for example:
```python
cudnn.benchmark = True
net = Network()
net.cuda()
net = nn.DataParallel(net, device_ids=args.gpus)
net = convert_model(net)
```
However, after training, I found that the results are far from my expectations, even worse than with the nn.BatchNorm2d that comes with PyTorch. Am I using the convert_model function wrongly? Or are there some points to note? Thank you very much!
Could you please add more detail on this? For example:
What do you mean by "much worse" than the original BatchNorm?
Also, please check the hyperparameters: are you using the same hyperparameters for all implementations? This includes the batch size, the learning rate, and especially the parameters of the BN layers (momentum, eps, etc.).
If you are using a larger batch size because of multi-GPU training, please scale up the learning rate accordingly.
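The scaling rule the comment above refers to is usually the linear one: if the effective batch size grows k-fold, multiply the learning rate by k. A minimal sketch (all concrete numbers are illustrative placeholders, not values from this thread):

```python
# Linear learning-rate scaling heuristic: scale the LR by the same factor
# as the effective batch size. The baseline numbers below are made up
# purely for illustration.

def scale_lr(base_lr: float, base_batch_size: int, batch_size: int) -> float:
    """Return the learning rate scaled linearly with the batch size."""
    return base_lr * batch_size / base_batch_size

# Example: a baseline tuned for batch size 16 on one GPU, now trained
# with batch size 64 across 4 GPUs -> 4x the learning rate.
print(scale_lr(0.01, 16, 64))
```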
Hi,
I use the same hyperparameters for all implementations. To achieve this, I replace the BatchNorm class with an assignment of the form `BatchNorm2d = xxx`, where xxx is the batchnorm implementation under test.
I tested three BatchNorm functions on the NYU Depth V2 dataset (semantic segmentation task), and here is their performance:
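Besides rebinding the class name as above, the conversion can also be done after the model is built by walking the module tree, which is the general idea behind a convert_model-style helper. The sketch below is not the repository's actual implementation; `replace_bn` and `make_norm` are hypothetical names, and GroupNorm is used only as a stand-in replacement:

```python
import torch.nn as nn

# Sketch of module-tree normalization replacement: recursively visit every
# child module and swap each nn.BatchNorm2d for a layer built by the
# (hypothetical) make_norm factory.

def replace_bn(module: nn.Module, make_norm) -> nn.Module:
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, make_norm(child.num_features))
        else:
            replace_bn(child, make_norm)  # recurse into nested containers
    return module

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
net = replace_bn(net, lambda c: nn.GroupNorm(4, c))
print(type(net[1]).__name__)  # the BatchNorm2d slot now holds a GroupNorm
```

Note that a faithful conversion would also copy the affine weights and running statistics from the old layer into the new one; this sketch omits that step, which matters when comparing implementations fairly.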