Optimizing Hadoop Scheduling in Single-Board-Computer-Based Heterogeneous Clusters

Author:

Qureshi Basit1ORCID

Affiliation:

1. Department of Computer Science, Prince Sultan University, Riyadh 11586, Saudi Arabia

Abstract

Single-board computers (SBCs) are emerging as an efficient and economical solution for fog and edge computing, providing localized big data processing with lower energy consumption. Newer and faster SBCs deliver improved performance while still maintaining a compact form factor and cost-effectiveness. In recent times, researchers have addressed scheduling issues in Hadoop-based SBC clusters. Despite their potential, traditional Hadoop configurations struggle to optimize performance in heterogeneous SBC clusters due to disparities in computing resources. Consequently, we propose modifications to the scheduling mechanism to address these challenges. In this paper, we leverage the use of node labels introduced in Hadoop 3+ and define a Frugality Index that categorizes and labels SBC nodes based on their physical capabilities, such as CPU, memory, disk space, etc. Next, an adaptive configuration policy modifies the native fair scheduling policy by dynamically adjusting resource allocation in response to workload and cluster conditions. Furthermore, the proposed frugal configuration policy considers prioritizing the reduced tasks based on the Frugality Index to maximize parallelism. To evaluate our proposal, we construct a 13-node SBC cluster and conduct empirical evaluation using the Hadoop CPU and IO intensive microbenchmarks. The results demonstrate significant performance improvements compared to native Hadoop FIFO and capacity schedulers, with execution times 56% and 22% faster than the best_cap and best_fifo scenarios. Our findings underscore the effectiveness of our approach in managing the heterogeneous nature of SBC clusters and optimizing performance across various hardware configurations.

Funder

Prince Sultan University

Publisher

MDPI AG

Reference31 articles.

1. Serverless-like Platform for Container-Based YARN Clusters;Enes;Future Gener. Comput. Syst.,2024

2. Warade, M., Schneider, J.-G., and Lee, K. (2022). Measuring the Energy and Performance of Scientific Workflows on Low-Power Clusters. Electronics, 11.

3. Commodity Single Board Computer Clusters and Their Applications;Johnston;Future Gener. Comput. Syst.,2018

4. An Efficient Implementation of Mobile Raspberry Pi Hadoop Clusters for Robust and Augmented Computing Performance;Srinivasan;J. Inf. Process. Syst.,2018

5. The Development of a Low-Cost Big Data Cluster Using Apache Hadoop and Raspberry Pi. A Complete Guide;Neto;Comput. Electr. Eng.,2022

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3