
NaN loss issue during training #5


Open
abrar-aw opened this issue Feb 2, 2023 · 1 comment

Comments

@abrar-aw

abrar-aw commented Feb 2, 2023

Hi there, I really appreciate your work. I am trying to apply the model to a custom dataset of about 100 words in my native sign language, with roughly 8 video samples per word. However, when I train your custom model, I keep getting NaN training loss and NaN validation loss. So far I have tried lowering the learning rate, reducing the batch size, and tweaking model parameters such as the dropout rate, and I have also tried a different RNN type (e.g., LSTM), but I just can't get rid of the NaN values. The dataset I am using is similar to the WLASL dataset: 256x256 resolution at 25 FPS. Could you please advise on what I could do to get it to work? I could really use an expert opinion on this, thanks!

Added info: this only happens when I try to train on more than 20 classes/glosses.
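
To make the setup concrete, here is a simplified stand-in for what I am running (dummy data, placeholder shapes, and a Keras-style API used purely for illustration; it is not the repo's actual training code):

```python
# Simplified sketch of the failing setup: placeholder shapes and dummy data,
# Keras-style API assumed for illustration only.
import numpy as np
import tensorflow as tf

num_classes = 25               # NaN appears once I go past ~20 glosses
timesteps, features = 25, 128  # placeholder: per-frame feature vectors at 25 FPS

# Dummy arrays standing in for the real gloss videos (~8 samples per word).
X = np.random.rand(num_classes * 8, timesteps, features).astype("float32")
y = np.random.randint(0, num_classes, size=(num_classes * 8,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(timesteps, features)),
    tf.keras.layers.LSTM(64),      # I have tried different RNN types, e.g. LSTM
    tf.keras.layers.Dropout(0.3),  # and several dropout rates
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

# Lowered learning rate and smaller batch size, but the loss still goes NaN.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, batch_size=8, epochs=2, validation_split=0.2)
```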

@simonefinelli
Owner

I think the number of videos may have an impact. As the number of classes grows, if there are only a few videos per class, the loss can become too large to be represented numerically (the network has a lot of difficulty distinguishing the classes), which can then show up as NaN. Also make sure that the number of output neurons in the last layer matches the number of classes.
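
As a rough illustration (a minimal Keras-style sketch, not the repo's exact code; the names and shapes are placeholders), the last layer and the labels must agree on the class count, and gradient clipping plus a NaN guard makes it easier to see where the loss blows up:

```python
# Minimal illustration: one output unit per class, clipped gradients,
# and a callback that stops training as soon as the loss turns NaN.
import tensorflow as tf

num_classes = 100  # must equal the number of glosses in the label encoding

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(25, 128)),  # placeholder sequence shape
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(num_classes, activation="softmax"),  # one unit per class
])

# Clip gradient norms so a single very large loss cannot overflow to NaN.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4, clipnorm=1.0),
    loss="sparse_categorical_crossentropy",  # labels must be ints in [0, num_classes)
    metrics=["accuracy"],
)

# Sanity check before training: output width matches the class count.
assert model.output_shape[-1] == num_classes

# Stop immediately if the loss becomes NaN, so the offending step is easy to find.
nan_guard = tf.keras.callbacks.TerminateOnNaN()
# model.fit(X, y, callbacks=[nan_guard], ...)
```

If the loss still turns NaN with clipping and a reduced learning rate, it is also worth checking the input features themselves for NaN/Inf values before they reach the network.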
