1. Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. 2017. QSGD: Communication-efficient SGD via gradient quantization and encoding. In Advances in Neural Information Processing Systems. 1709–1720.
2. Demystifying Parallel and Distributed Deep Learning
3. Luke N Darlow, Elliot J Crowley, Antreas Antoniou, and Amos J Storkey. 2018. CINIC-10 is not ImageNet or CIFAR-10. arXiv preprint arXiv:1810.03505.
4. Priya Goyal, Piotr Dollar, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. 2017. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. arXiv preprint arXiv:1706.02677.
5. Farzin Haddadpour. 2019. Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization. In Advances in Neural Information Processing Systems.