Author:
Kim Yusik, Righter Rhonda, Wolff Ronald
Abstract
Parallel processing is a way to use resources efficiently by processing several jobs simultaneously on different servers. In a well-controlled environment where the status of the servers and the jobs is well known, everything is nearly deterministic, and replicating jobs on different servers is obviously a waste of resources. However, in a poorly controlled environment where the servers are unreliable and/or their capacity is highly variable, it is desirable to design a system that is robust in the sense that it is not affected by poorly performing servers. By replicating jobs and assigning them to several different servers simultaneously, we not only achieve robustness but can also, under certain conditions, make the system more efficient, so that jobs are processed at a faster rate overall. In this paper we consider the option of replicating jobs and study how different ‘degrees’ of replication, ranging from no replication to full replication, affect the performance of a system of parallel servers.
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics, Statistics and Probability
Cited by
11 articles.
1. Efficient scheduling in redundancy systems with general service times;Queueing Systems;2024-03-22
2. Scheduling under redundancy;ACM SIGMETRICS Performance Evaluation Review;2022-08-30
3. Service Rate Region: A New Aspect of Coded Distributed System Design;IEEE Transactions on Information Theory;2021-12
4. Achievable Stability in Redundancy Systems;ACM SIGMETRICS Performance Evaluation Review;2021-06-22
5. Achievable Stability in Redundancy Systems;Abstract Proceedings of the 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems;2021-05-31