In the paper, the optimal learning rate is 2e-3.
In pretrain.sh, the learning rate is set to 5e-4.
Could you please advise which learning rate is best for training MobileLLM models?