Performance optimization of computing task scheduling based on the Hadoop big data platform

Published: 2022-12-25
ISSN: 0941-0643
Container-title: Neural Computing and Applications
Language: en
Short-container-title: Neural Comput & Applic

Author:

Li Yang, Hei Xinhong

Abstract

Hadoop, a distributed computing framework that can efficiently process large-scale datasets, is used by a growing number of organizations as the base computing framework for building cloud computing platforms. Improving its execution efficiency is an active research direction, and task scheduling is a key factor affecting that efficiency, so it is important to identify the shortcomings of the existing scheduling and improve on them. This paper analyses the optimization of computing task scheduling performance on the Hadoop big data platform. It first analyses Hadoop big data processing: Hadoop is highly scalable, since computing nodes can be added at any time and can join cluster work through simple configuration. It then discusses improvements to the Hadoop resource scheduling algorithm. The data-localization task scheduling algorithm proposed in this paper is compared with Hadoop's default task scheduling algorithm. The proposed algorithm achieves better data locality in all four jobs, with more data-local tasks, meeting the expected goal. The effectiveness of the algorithm is verified, and performance is improved by 30%.

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Software

Link

https://link.springer.com/content/pdf/10.1007/s00521-022-08114-3.pdf

References: 24 articles.

1. Lu P, Zhu Z (2017) Data-oriented task scheduling in fixed- and flexible-grid multilayer inter-DC optical networks: a comparison study. J Lightwave Technol 35(24):5335–5346

2. Zheng W, Wu H, Nie C (2017) Integrating task scheduling and cache locking for multicore real-time embedded systems. ACM Sigplan Not 52(4):71–80

3. Ahmad S, Malik S, Kim DH (2018) Comparative analysis of simulation tools with visualization based on realtime task scheduling algorithms for IoT embedded applications. Int J Grid Distrib Comput 11(2):1–10

4. Li ZL, Li XJ, Sun W (2017) Task scheduling model and algorithm for agile satellite considering imaging quality. J Astronaut 38(6):590–597

5. Agarwal U (2017) Cloud computing BDaaS and HDaaS (big data as a service and Hadoop as a service). Int J Comput Sci Eng 5(11):131–134

Cited by 5 articles.

1. High-speed parallel segmentation algorithms of MeanShift for litchi canopies based on Spark and Hadoop;International Journal of Modeling, Simulation, and Scientific Computing;2024-05-04

2. An Improved Heuristic Schedule Approach of Task Scheduling for Big Data processing System;2024 IEEE 9th International Conference for Convergence in Technology (I2CT);2024-04-05

3. Sports Prediction Model through Cloud Computing and Big Data Based on Artificial Intelligence Method;Journal of Intelligent Learning Systems and Applications;2024

4. MRAbF: MapReduce Resource Allocation Optimization Algorithm Based on Fair Policy;Proceedings of the 4th International Conference on Artificial Intelligence and Computer Engineering;2023-11-17

5. MapReduce scheduling algorithms in Hadoop: a systematic study;Journal of Cloud Computing;2023-10-10