1. Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, and Milan Vojnovic. 2017. QSGD: Communication-efficient SGD via gradient quantization and encoding. In Neural Information Processing Systems (NIPS).
2. Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, and Animashree Anandkumar. 2018. signSGD: Compressed optimisation for non-convex problems. In International Conference on Machine Learning (ICML).
3. AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training.
4. Wei Dai, Eric P. Xing, et al. 2018. Toward understanding the impact of staleness in distributed machine learning. arXiv preprint.
5. Christopher M. De Sa, Ce Zhang, Kunle Olukotun, and Christopher Ré. 2015. Taming the wild: A unified analysis of hogwild-style algorithms. In Neural Information Processing Systems (NIPS).