A Time-Series Model for Varying Worker Ability in Heterogeneous Distributed Computing Systems

Published: 2023-04-16 Issue: 8 Volume: 13 Page: 4993
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en

Author:

Kim Daejin¹, Lee Suji², Jung Hohyun²,³

Affiliation:

1. Samsung Electronics, Suwon 16677, Republic of Korea

2. Department of Statistics, Sungshin Women’s University, Seoul 02844, Republic of Korea

3. Data Science Center, Sungshin Women’s University, Seoul 02844, Republic of Korea

Abstract

In this paper, we consider the problem of estimating the time-dependent ability of workers participating in distributed matrix-vector multiplication over heterogeneous clusters. Specifically, we model each worker’s ability as a latent variable and introduce a log-normally distributed working rate, parameterized as a function of that latent variable, so that the working rate increases with the worker’s latent ability and takes only positive values. This modeling is motivated by the need to reflect the impact of time-dependent external factors on the workers’ performance. We estimate the latent variable and the parameters using the expectation-maximization (EM) algorithm combined with the particle method. The resulting estimates of, and inference on, the working rates are used to allocate tasks to the workers so as to reduce expected latency. In simulations, we observe that this estimation and inference are effective in reducing expected latency.
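
The abstract gives no equations, so the following is only a minimal sketch of the kind of model it describes, not the authors' formulation. It assumes the latent ability follows an AR(1) process and that the log of the working rate is linear in the ability with Gaussian noise, which makes the rate log-normal, positive, and increasing in ability when the slope is positive. The function names, the parameters `a`, `b`, `sigma`, `phi`, `tau`, and the proportional allocation rule are all hypothetical choices made for illustration. The EM and particle-method estimation step is not shown: the sketch uses the simulated latent state directly in place of an estimate.

```python
import numpy as np


def simulate_working_rates(n_workers, n_steps, a=0.0, b=1.0, sigma=0.2,
                           phi=0.9, tau=0.3, seed=0):
    """Simulate latent abilities and log-normal working rates.

    Assumptions (not taken from the paper): the latent ability theta is a
    stationary AR(1) process with coefficient phi and innovation std tau,
    and log(rate) = a + b * theta + Gaussian noise with std sigma.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_steps, n_workers))
    # Draw the initial state from the stationary distribution of the AR(1).
    theta[0] = rng.normal(0.0, tau / np.sqrt(1.0 - phi**2), size=n_workers)
    for t in range(1, n_steps):
        theta[t] = phi * theta[t - 1] + rng.normal(0.0, tau, size=n_workers)
    log_rate = a + b * theta + rng.normal(0.0, sigma, size=theta.shape)
    return theta, np.exp(log_rate)  # exp keeps every rate strictly positive


def allocate_rows(expected_rate, n_rows):
    """Split n_rows across workers in proportion to their expected rates.

    Proportional splitting is a simple heuristic assumed here; fractional
    shares are rounded with the largest-remainder method.
    """
    share = expected_rate / expected_rate.sum() * n_rows
    rows = np.floor(share).astype(int)
    leftover = n_rows - rows.sum()
    order = np.argsort(-(share - rows))  # largest fractional parts first
    rows[order[:leftover]] += 1
    return rows


theta, rate = simulate_working_rates(n_workers=4, n_steps=50)
# Mean of a log-normal variable: exp(mu + sigma^2 / 2), with mu = a + b * theta.
expected = np.exp(0.0 + 1.0 * theta[-1] + 0.2**2 / 2)
rows = allocate_rows(expected, n_rows=1000)
print(rows)
```

Under these assumptions, a worker with a higher latent ability at the last time step receives at least as many matrix rows as a worker with a lower one, which is the qualitative behavior the abstract describes for rate-aware task allocation.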

Funder

National Research Foundation of Korea

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/8/4993/pdf

References: 28 articles.

1. Large scale distributed deep networks;Dean;Proc. Adv. Neural Inform. Process. Syst. (NIPS),2012

2. The tail at scale;Dean;Commun. ACM,2013

3. Speeding up distributed machine learning using codes;Lee;IEEE Trans. Inf. Theory,2018

4. Lee, K., Suh, C., and Ramchandran, K. (2017, June 25–30). High-dimensional coded matrix multiplication. Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany.

5. Yu, Q., Maddah-Ali, M., and Avestimehr, S. (2017, December 4–9). Polynomial codes: An optimal design for high-dimensional coded matrix multiplication. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

Cited by 1 article.

1. Design of a Rapid Evaluation System Based on Telemetry Data;2023 3rd International Conference on Communication Technology and Information Technology (ICCTIT);2023-11-24