Author:
Ajibade Lukuman Saheed, Abu Bakar Kamalrulnizam, Yusuf Muhammed Nura, Isyaku Babangida
Abstract
One of the most difficult issues in using MapReduce to parallelise and distribute large-scale data processing is detecting straggling tasks, that is, identifying tasks that are running on weak (slow) nodes. The overall execution time of a job is the sum of the execution times of five stages: two in the Map phase (copy and combine) and three in the Reduce phase (shuffle, sort, and reduce). The main objective of this study is to estimate the remaining time to complete a task and the time already taken, and to detect the straggler(s) during parallel execution. The suggested method, RMSTD, uses the Progress Score (PS), Progress Rate (PR), and Remaining Time (RT) metrics to detect straggling tasks. Its results were compared with those of popular algorithms in this domain, namely Longest Approximate Time to End (LATE) and Combinatory Late-Machine (CLM). The comparison shows that the method is capable of detecting straggling tasks, accurately estimating execution time, and supporting task acceleration. RMSTD outperforms LATE by 23.30% and CLM by 19.51%.
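The abstract names the three metrics but does not give their formulas. The sketch below shows one way they can fit together, assuming the conventional LATE-style definitions: PR = PS / elapsed time and RT = (1 - PS) / PR. The `Task` class, the `detect_stragglers` function, and the `slack` threshold rule are hypothetical illustrations, not the detection rule used by RMSTD.

```python
from dataclasses import dataclass
from math import inf, isfinite


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative."""
    task_id: str
    progress_score: float  # PS, fraction of work done, in [0, 1]
    elapsed: float         # seconds since the task started


def progress_rate(task: Task) -> float:
    """PR = PS / elapsed time (assumed LATE-style definition)."""
    return task.progress_score / task.elapsed if task.elapsed > 0 else 0.0


def remaining_time(task: Task) -> float:
    """RT = (1 - PS) / PR; infinite if the task has made no progress."""
    pr = progress_rate(task)
    return (1.0 - task.progress_score) / pr if pr > 0 else inf


def detect_stragglers(tasks: list[Task], slack: float = 1.5) -> list[str]:
    """Flag tasks whose RT exceeds slack * mean RT.

    The threshold rule is an assumption made for this sketch only.
    """
    finite = [rt for rt in map(remaining_time, tasks) if isfinite(rt)]
    if not finite:
        return []
    mean_rt = sum(finite) / len(finite)
    return [t.task_id for t in tasks if remaining_time(t) > slack * mean_rt]


tasks = [
    Task("map-0", progress_score=0.80, elapsed=40.0),
    Task("map-1", progress_score=0.75, elapsed=30.0),
    Task("map-2", progress_score=0.20, elapsed=40.0),  # slow node
]
stragglers = detect_stragglers(tasks)
```

With these sample values, `map-2` has a much larger estimated remaining time than the other two tasks, so it is the only one flagged under the assumed threshold.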
Publisher
Universiti Putra Malaysia