Description
Hello,
I apologize for disturbing you! Thank you for your open-source project ReST, which has been incredibly helpful for my studies. During my learning process, I encountered the following issues:
Starting from the original code, I modified only the dataset-related parameters in the Wildtrack.yml file. When training the SG model, the first and second training runs produced correct results. From the third run onward, however, training failed to converge: once avg_train_loss reached 0.0045, the loss stopped decreasing. Every subsequent attempt failed in the same way as the third run, and I have not been able to obtain correct results since.
I logged the back-propagated gradients of the model during each training run. The gradient values decrease rapidly, and I suspect that these vanishing gradients are the cause of the issue.
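For reference, the kind of check described above can be sketched as follows. This is not the ReST training code and not the exact logging used for log_5 and log_6; the function names, the example values, and the `1e-6` threshold are all hypothetical. In a PyTorch training loop, the per-layer values would typically be read from each parameter's `.grad` after `loss.backward()`.

```python
import math


def global_grad_norm(grads):
    """L2 norm over all gradient values.

    `grads` is an iterable of flat sequences of floats, one per layer
    (hypothetical layout chosen for this sketch).
    """
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def first_vanishing_step(norms, threshold=1e-6):
    """Return the index of the first step whose norm is below `threshold`.

    Returns None if no step falls below the threshold.
    The default threshold is an arbitrary illustrative value.
    """
    for step, norm in enumerate(norms):
        if norm < threshold:
            return step
    return None


# Made-up gradient values for three training steps, shrinking quickly.
history = [
    global_grad_norm([[0.3, -0.4], [0.0]]),
    global_grad_norm([[3e-4, -4e-4], [0.0]]),
    global_grad_norm([[3e-8, -4e-8], [0.0]]),
]
print(first_vanishing_step(history))
```

Logging one global norm per step keeps the log small while still showing the point at which the gradients collapse.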
Notes:
The code was not modified between these training runs; it was exactly the same, and the intervals between runs were very short.
In the attachments, log_1 and log_2 are the correct training logs; log_3 and log_4 are the incorrect training logs; log_5 and log_6 are the incorrect training logs with the training gradients saved.
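Since identical code gave different outcomes across runs, one possible check (an assumption, not something confirmed by the logs) is whether the difference comes from random initialization or data shuffling. A minimal sketch for fixing the random seeds before training is shown below; `set_seed` is a hypothetical helper, not part of ReST, and it only seeds NumPy and PyTorch if they are installed.

```python
import random


def set_seed(seed):
    """Seed the common random number generators so runs are comparable."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Two draws after the same seed should match.
set_seed(123)
first = random.random()
set_seed(123)
second = random.random()
print(first == second)
```

If runs with the same seed behave the same and runs with different seeds do not, that would point to sensitivity to initialization rather than to the code or the configuration. Some GPU operations remain non-deterministic even with fixed seeds, so small differences can still occur.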
I am not sure how to resolve this issue and hope you can guide me on how to fix it. Looking forward to your reply, thank you very much.
log_1.txt
log_2.txt
log_3.txt
log_4.txt
log_5.txt
log_6.txt