Affiliation:
1. IBM T. J. Watson Research, Hawthorne, NY, USA
Abstract
Job processing times in MapReduce/Hadoop production clusters exhibit heavy-tailed characteristics. These phenomena result from both the workload features and the adopted scheduling algorithms. Analytically understanding the delays under different MapReduce schedulers can facilitate the design and deployment of large Hadoop clusters. The map and reduce tasks of a MapReduce job differ fundamentally yet depend tightly on each other, which complicates the analysis. This dependence also leads to an interesting starvation problem with the widely used Fair Scheduler, caused by its greedy approach to launching reduce tasks. To address this issue, we design and implement Coupling Scheduler, which launches reduce tasks gradually, depending on the progress of the map tasks. Real experiments demonstrate improvements in job response times of up to an order of magnitude.
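To make the contrast between greedy and gradual reduce launching concrete, the following is a minimal sketch of one possible coupling rule, in which the number of launched reduce tasks tracks the fraction of completed map tasks. The abstract does not specify the actual rule used by Coupling Scheduler, so the proportional rule, the function name `reduces_to_launch`, and all parameter names are assumptions made for illustration only.

```python
def reduces_to_launch(maps_done: int, maps_total: int,
                      reduces_total: int, reduces_launched: int) -> int:
    """Hypothetical coupling rule (an assumption, not the paper's algorithm).

    Returns how many additional reduce tasks may be launched so that the
    launched fraction of reduces does not run ahead of map progress.
    """
    if maps_total <= 0:
        raise ValueError("maps_total must be positive")
    if not 0 <= maps_done <= maps_total:
        raise ValueError("maps_done must lie in [0, maps_total]")
    # Ceiling of (maps_done / maps_total) * reduces_total, in integer arithmetic.
    target = (maps_done * reduces_total + maps_total - 1) // maps_total
    # Never return a negative count if more reduces are already running.
    return max(0, target - reduces_launched)


def greedy_reduces_to_launch(reduces_total: int, reduces_launched: int) -> int:
    """Greedy baseline for comparison: launch every remaining reduce at once."""
    return max(0, reduces_total - reduces_launched)
```

Under this sketch, a job with 10 map tasks and 4 reduce tasks would be allowed 2 reduce tasks once 5 maps have finished, whereas the greedy baseline would request all 4 immediately and could hold reduce slots while waiting for map output.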
Based on extensive measurements and source code investigations, we propose analytical models for the default FIFO Scheduler and Fair Scheduler, as well as for our implemented Coupling Scheduler. For a class of heavy-tailed map service time distributions, namely those that are regularly varying of index -a, we derive the distribution tail of the job processing delay under each of the three schedulers. The default FIFO Scheduler causes the delay to be regularly varying of index -a+1. Interestingly, we discover a criticality phenomenon for Fair Scheduler: the delay under it can change from regularly varying of index -a to regularly varying of index -a+1, depending on the maximum number of reduce tasks of a job. Other, more subtle behaviors also exist. In contrast, the delay distribution tail under Coupling Scheduler can be one order lower than that under Fair Scheduler under some conditions, implying better performance.
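For readers unfamiliar with the terminology, the standard definition of a regularly varying tail can be written as follows. The symbols B (map service time), T (job processing delay), and the slowly varying functions ℓ and ℓ_T are notation introduced here for illustration and do not come from the abstract.

```latex
% B: map service time; T: job processing delay (illustrative notation).
% B is regularly varying of index -a if its tail satisfies
\[
  \mathbb{P}[B > x] = \frac{\ell(x)}{x^{a}},
  \qquad
  \lim_{x \to \infty} \frac{\ell(\lambda x)}{\ell(x)} = 1
  \quad \text{for all } \lambda > 0 ,
\]
% where \ell is a slowly varying function.
% A delay that is regularly varying of index -a+1 has the heavier tail
\[
  \mathbb{P}[T > x] = \frac{\ell_T(x)}{x^{a-1}} ,
\]
% with \ell_T again slowly varying.
```

In this notation, moving from index -a to index -a+1 means the delay tail decays one polynomial order more slowly than the map service time tail, which is the sense in which a tail can be "one order lower" or higher in the abstract.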
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Software
Cited by
42 articles.