1. McMahan et al., "Communication-efficient learning of deep networks from decentralized data," in Proc. Artif. Intell. Stat., 2017.
2. Basu et al., "Qsparse-local-SGD: Distributed SGD with quantization, sparsification and local computations," in Proc. Adv. Neural Inf. Process. Syst., 2019.
3. Stich, "Local SGD converges fast and communicates little," arXiv:1805.09767, 2018.
4. Zinkevich et al., "Parallelized stochastic gradient descent," in Proc. Adv. Neural Inf. Process. Syst., 2010.