Online Task Resource Consumption Prediction for Scientific Workflows

Published:2015-09 Issue:03 Volume:25 Page:1541003
ISSN:0129-6264
Container-title:Parallel Processing Letters
language:en
Short-container-title:Parallel Process. Lett.

Author:

Rafael Ferreira da Silva¹, Gideon Juve¹, Mats Rynge¹, Ewa Deelman¹, Miron Livny²

Affiliation:

1. University of Southern California, Information Sciences Institute, Marina Del Rey, CA, USA

2. University of Wisconsin-Madison, Madison, WI, USA

Abstract

Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable workflow executions. Such algorithms often assume that accurate estimates are available, but such estimates are difficult to generate in practice. In this work, we first profile five real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task requirements based on these profiles. Our method estimates task runtime, disk space, and peak memory consumption based on the size of the tasks’ input data. It looks for correlations between the parameters of a dataset; if no correlation is found, the dataset is divided into smaller subsets using a clustering technique. Task estimates are generated from the ratio of the parameter to the input data size if the two are correlated, or otherwise from the probability distribution function of the parameter. We then propose an online estimation process based on the MAPE-K loop, where task executions are monitored and estimates are updated as more information becomes available. Experimental results show that our online estimation process results in much more accurate predictions than an offline approach, where all task requirements are estimated prior to workflow execution.
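
The estimation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `OnlineEstimator`, the use of the Pearson coefficient, the 0.8 correlation threshold, and the use of the sample mean as a stand-in for a distribution-based estimate are all assumptions made for illustration, and the clustering step is omitted. The sketch only shows the two branches (ratio-based when the parameter correlates with input size, distribution-based otherwise) and how estimates change as finished tasks are observed.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when it is undefined."""
    if len(xs) < 2:
        return 0.0
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


class OnlineEstimator:
    """Hypothetical estimator for one task type and one parameter
    (e.g. runtime in seconds) as a function of input data size."""

    def __init__(self, threshold=0.8):
        # Assumed cut-off above which the parameter counts as correlated.
        self.threshold = threshold
        self.input_sizes = []
        self.values = []

    def observe(self, input_size, value):
        """Monitoring step: record the measurement of a finished task."""
        self.input_sizes.append(input_size)
        self.values.append(value)

    def estimate(self, input_size):
        """Return an estimate for a task with the given input size,
        or None if nothing has been observed yet."""
        if not self.values:
            return None
        r = pearson(self.input_sizes, self.values)
        if r >= self.threshold:
            # Correlated: scale the mean parameter/input-size ratio.
            ratios = [v / s for v, s in zip(self.values, self.input_sizes) if s > 0]
            return mean(ratios) * input_size
        # Not correlated: fall back to the sample mean (a simplification
        # of a distribution-based estimate).
        return mean(self.values)


# Usage: estimates are refreshed after every observed task.
runtime = OnlineEstimator()
for size, seconds in [(10, 20.0), (20, 40.0), (40, 80.0)]:
    runtime.observe(size, seconds)
    print(size, runtime.estimate(30))
```

Because `estimate` recomputes the correlation from everything observed so far, early estimates use the fallback branch and later ones switch to the ratio once enough correlated measurements have arrived.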

Publisher

World Scientific Pub Co Pte Ltd

Subject

Hardware and Architecture,Theoretical Computer Science,Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0129626415410030

Cited by 42 articles.

1. Application-Oriented Cloud Workload Prediction: A Survey and New Perspectives;Tsinghua Science and Technology;2025-02

2. Lotaru: Locally predicting workflow task runtimes for resource management on heterogeneous infrastructures;Future Generation Computer Systems;2024-01

3. Improving prediction of computational job execution times with machine learning;Concurrency and Computation: Practice and Experience;2023-09-12

4. A Cloud Broker for Executing Deadline-Constrained Periodic Scientific Workflows;IEEE Transactions on Services Computing;2023-09

5. A Cost-Efficient Workflow as a Service Broker Using On-demand and Spot Instances;Journal of Grid Computing;2023-07-08