1. Lian, X., et al.: Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4–9, 2017, Long Beach, CA, USA, pp. 5330–5340 (2017)
2. Yu, H., Yang, S., Zhu, S.: Parallel restarted SGD with faster convergence and less communication: demystifying why model averaging works for deep learning. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019). AAAI Press (2019)
3. Stich, S.U.: Local SGD converges fast and communicates little. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. OpenReview.net (2019)
4. Koloskova, A., Loizou, N., Boreiri, S., Jaggi, M., Stich, S.U.: A unified theory of decentralized SGD with changing topology and local updates. In: Daumé III, H., Singh, A. (eds.) Proceedings of the 37th International Conference on Machine Learning, ICML 2020. Proceedings of Machine Learning Research, vol. 119, pp. 5381–5393. PMLR (2020)
5. Pappas, C., Chatzopoulos, D., Lalis, S., Vavalis, M.: IPLS: a framework for decentralized federated learning. In: 2021 IFIP Networking Conference (IFIP Networking), pp. 1–6 (2021)