Affiliation:
1. Paderborn University
2. Hasso-Plattner-Institut
Abstract
We consider methods in which processors in a distributed computing (DC) infrastructure asynchronously compute updates for a set of parameters. In such scenarios, parameter updates can experience practically unbounded stochastic processing times caused by effects such as queuing, processor sharing, priorities, preemption, or heavy-tailed traffic. As a result, while one processor reads the parameters and calculates a new update based on them, other processors can update the parameters multiple times. The resulting error between the current parameters and the older version used to calculate the update is thus a function of a discrete information delay that we call Age-of-Information (AoI). Knowing the properties of AoI is important for countering the errors it causes, predicting the performance of asynchronous algorithms, and effectively solving problems in machine learning and artificial intelligence. To this end, we model the processing times in a DC system as parallel renewal processes. For this model, we derive the distribution of, and moment bounds for, the discrete AoI affecting asynchronous algorithms executed on the DC system. We also derive exact expressions for the asymptotic mean and sharp bounds for the asymptotic variance.
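The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or code. It assumes that each worker repeatedly reads the current parameter version, computes for an i.i.d. random time (one renewal process per worker), and then applies its update, and that the discrete AoI of an update is the number of updates applied by any worker between that read and that write. The names `simulate_aoi` and `sample_time` are hypothetical and chosen only for this illustration.

```python
import heapq
import random
import statistics


def simulate_aoi(num_workers, num_updates, sample_time, seed=0):
    """Return the discrete AoI of the first `num_updates` applied updates.

    Assumption (not taken from the paper): the AoI of an update is the
    number of parameter updates applied between the moment the worker
    read the parameters and the moment its own update is applied.
    `sample_time(rng)` draws one processing time for one update.
    """
    rng = random.Random(seed)
    version = 0  # number of updates applied so far
    # Heap entries: (finish_time, worker_id, version_read_at_start).
    events = [(sample_time(rng), w, 0) for w in range(num_workers)]
    heapq.heapify(events)
    ages = []
    while len(ages) < num_updates:
        finish, worker, version_read = heapq.heappop(events)
        ages.append(version - version_read)  # updates missed while computing
        version += 1  # this worker's update is applied
        # The worker immediately reads the new version and starts again.
        heapq.heappush(events, (finish + sample_time(rng), worker, version))
    return ages


# Heavy-tailed (Pareto) processing times as one example of an
# "effectively unbounded" delay distribution; the shape 1.5 is arbitrary.
ages = simulate_aoi(8, 20000, lambda rng: rng.paretovariate(1.5))
print("empirical mean AoI:", statistics.mean(ages))
print("empirical variance of AoI:", statistics.pvariance(ages))
print("largest observed AoI:", max(ages))
```

Swapping the `sample_time` argument (for example `lambda rng: rng.expovariate(1.0)`) gives a quick way to compare the empirical AoI under light-tailed and heavy-tailed processing times. The empirical moments printed here are only simulation estimates and are not the bounds derived in the paper.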
Publisher
Research Square Platform LLC