Asynchronous stochastic gradient descent

Somehow I missed the recent trend of using asynchronous SGD (ASGD) instead of synchronous SGD in model training. This note simply collects several resources that I found helpful for understanding ASGD.

http://engineering.skymind.io/distributed-deep-learning-part-1-an-introduction-to-distributed-training-of-neural-networks (Highly recommended) A general introduction to ASGD methods.
Large Scale Distributed Deep Networks - http://www.cs.toronto.edu/~ranzato/publications/DistBeliefNIPS2012_withAppendix.pdf A somewhat naive ASGD method.
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent - https://arxiv.org/abs/1106.5730 A classic paper that ignited the trend of using ASGD in model training.
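
To make the core idea concrete, here is a minimal single-process sketch of what "asynchronous" means in these resources: a worker computes a gradient from a possibly stale copy of the weights, and that gradient is applied to the shared weights without waiting for the other workers. This is only a toy simulation on a noiseless least-squares problem, not the method from any of the papers above. The function name `simulate_async_sgd` and all of its parameters are made up for illustration.

```python
import random


def simulate_async_sgd(xs, ys, num_workers=4, steps=5000, lr=0.01, seed=0):
    """Toy simulation of asynchronous SGD for least-squares regression.

    Each simulated worker keeps its own snapshot of the shared weights.
    At every step one randomly chosen worker "finishes": it pushes a
    gradient computed from its (possibly stale) snapshot, then pulls the
    current shared weights.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    w = [0.0] * dim                                   # shared weights
    snapshots = [list(w) for _ in range(num_workers)]  # per-worker copies
    for _ in range(steps):
        k = rng.randrange(num_workers)                # worker that finishes next
        i = rng.randrange(len(xs))                    # its sampled training example
        stale_w = snapshots[k]
        # Prediction error is evaluated at the stale weights, not at w.
        err = sum(a * b for a, b in zip(xs[i], stale_w)) - ys[i]
        for j in range(dim):
            w[j] -= lr * err * xs[i][j]               # applied with no synchronization
        snapshots[k] = list(w)                        # worker pulls fresh weights
    return w


# Synthetic noiseless data generated from known weights (illustrative only).
data_rng = random.Random(42)
true_w = [1.0, -2.0, 0.5]
xs = [[data_rng.gauss(0, 1) for _ in true_w] for _ in range(200)]
ys = [sum(a * b for a, b in zip(x, true_w)) for x in xs]

w = simulate_async_sgd(xs, ys)
print([round(v, 3) for v in w])
```

In synchronous SGD every worker would compute its gradient from the same weights, and the update would wait for all of them. Here the only difference is that `stale_w` may lag behind `w` by a few updates. With a small learning rate the stale gradients still point in a useful direction, which is the intuition the resources above make rigorous.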
