Appendix C of the paper mentions the learning rate and momentum decay but not the momentum switch, which would seem to force the momentum back up to a relatively large value for most of the optimization. Is the momentum decay useful in practice?
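To make the concern concrete, here is a minimal sketch of how a momentum switch could override a multiplicative decay. It is not the repository's code: the function name `momentum_trace` and every parameter value except the 0.85 decay factor quoted in this thread are assumptions chosen for illustration.

```python
def momentum_trace(n_iters, initial, final, switch_iter, decay, decay_every):
    """Return the momentum used at each iteration under a hypothetical schedule.

    Momentum is multiplied by `decay` every `decay_every` iterations, and is
    reset to `final` at `switch_iter` (the assumed "momentum switch").
    """
    momentum = initial
    trace = []
    for it in range(n_iters):
        # Periodic multiplicative decay.
        if it > 0 and it % decay_every == 0:
            momentum *= decay
        # The switch overrides whatever the decay has produced so far.
        if it == switch_iter:
            momentum = final
        trace.append(momentum)
    return trace


# Hypothetical values, not taken from the paper or the repository.
trace = momentum_trace(n_iters=400, initial=0.5, final=0.8,
                       switch_iter=250, decay=0.85, decay_every=100)
print(trace[249], trace[250])  # momentum just before and at the switch
```

Under these assumed settings the momentum decays until the switch and then jumps back up to the final value, which is the behaviour the question is asking about.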
I don't think the final momentum value is ever used. The momentum-decay line uses a hard-coded value of `0.85` (ft-SNE/tsne.py, lines 186 to 187 in 4b5c368).