Affiliation:
1. Motilal Nehru National Institute of Technology Allahabad, Prayagraj, India
Abstract
Big data processing technology holds a prominent place in today's market. Hadoop is an efficient open-source distributed framework that processes big data at low cost using a cluster of commodity machines (nodes). YARN was introduced in Hadoop to allocate resources effectively among jobs. Still, YARN over-allocates resources to some tasks of a job and leaves cluster resources underutilized. This paper investigates how the CAPACITY and FAIR schedulers utilize resources in practice in a shared multi-tenant environment, using the HiBench benchmark suite. It compares the performance of these MapReduce job schedulers in two scenarios and proposes open research questions (ORQ) with potential solutions to guide future researchers. On average, the authors found that the CAPACITY and FAIR schedulers utilize 77% of RAM and 82% of CPU cores. Finally, the experimental evaluation shows that these schedulers over-allocate resources to some tasks and leave cluster resources underutilized in different scenarios.
Subject
Computer Networks and Communications, Hardware and Architecture
Cited by
3 articles.