RSB BCE and 1-vs-all #1701
Unanswered
concrete13377 asked this question in Q&A
Replies: 0 comments
Thank you so much for sharing your ResNet Strikes Back paper - the results are truly impressive!
However, I am struggling to understand a few things about the BCE loss. Could you explain the intuition for why it works better with Mixup/CutMix augmentations? Also, am I right that sigmoid is used during training and then swapped for softmax at inference? Finally, the paper calls the task multiclass 1-vs-all in one place and multilabel 1-vs-all in another. Which one is correct?
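To make the question concrete, here is a minimal sketch of how I currently picture the two losses on a Mixup target. It is plain Python, not code from the paper or from timm; the function names, the example logits, and the mixing weight 0.7 are my own assumptions, and my reading of the setup may well be wrong.

```python
import math

def sigmoid(z):
    # Logistic function applied to a single logit.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_ce(logits, target):
    # Standard cross-entropy: classes compete through the softmax.
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))

def sigmoid_bce(logits, target):
    # 1-vs-all view: each class is an independent binary problem,
    # and the per-class losses are averaged.
    total = 0.0
    for z, t in zip(logits, target):
        p = sigmoid(z)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(logits)

def mixup_target(num_classes, class_a, class_b, lam):
    # Soft target for an image mixed from class_a (weight lam)
    # and class_b (weight 1 - lam). Hypothetical helper.
    target = [0.0] * num_classes
    target[class_a] += lam
    target[class_b] += 1.0 - lam
    return target

# Illustrative values only.
logits = [2.0, 1.0, -1.0, -3.0]
target = mixup_target(num_classes=4, class_a=0, class_b=1, lam=0.7)

ce_loss = softmax_ce(logits, target)
bce_loss = sigmoid_bce(logits, target)

# Top-1 prediction under each activation.
sig_scores = [sigmoid(z) for z in logits]
soft_scores = softmax(logits)
top1_sigmoid = sig_scores.index(max(sig_scores))
top1_softmax = soft_scores.index(max(soft_scores))
```

Both sigmoid and softmax are monotonic in the logits, so the top-1 prediction comes out the same either way here, which is part of why I am unsure whether the activation used at inference matters.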
I appreciate your time and help.
Thank you once again!