1. Naman Agarwal, Ananda Theertha Suresh, Felix Yu, Sanjiv Kumar, and H. Brendan McMahan. 2018. cpSGD: Communication-efficient and differentially-private distributed SGD. In Proceedings of the Advances in Neural Information Processing Systems. 7575–7586.
2. Dan Alistarh, Torsten Hoefler, Mikael Johansson, Nikola Konstantinov, Sarit Khirirat, and Cédric Renggli. 2018. The convergence of sparsified gradient methods. In Proceedings of the Advances in Neural Information Processing Systems. 5973–5983.
3. Rotem Zamir Aviv, Ido Hakimi, Assaf Schuster, and Kfir Yehuda Levy. 2021. Asynchronous distributed learning: Adapting to gradient delays without prior knowledge. In Proceedings of the International Conference on Machine Learning. PMLR, 436–445.
4. Debraj Basu, Deepesh Data, Can Karakus, and Suhas Diggavi. 2019. Qsparse-local-SGD: Distributed SGD with quantization, sparsification and local computations. In Proceedings of the Advances in Neural Information Processing Systems. 14695–14706.