1. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent;Recht;Advances in neural information processing systems,2011
2. Taming the wild: A unified analysis of hogwild-style algorithms;De Sa;Advances in neural information processing systems,2015
3. Stochastic gradient descent tricks;Bottou;Neural Networks: Tricks of the Trade,2012
4. Large scale distributed deep networks;Dean;Advances in neural information processing systems,2012
5. Accurate, large minibatch SGD: Training ImageNet in 1 hour;Goyal;arXiv preprint arXiv:1706.02677,2017