New Performance Modeling Methods for Parallel Data Processing Applications

Author:

Bhimani Janki¹, Mi Ningfang¹, Leeser Miriam¹, Yang Zhengyu¹

Affiliation:

1. Northeastern University, Boston, MA, USA

Abstract

Predicting the performance of an application running on parallel computing platforms is becoming increasingly important because it affects development time and resource management. However, for iterative and multi-stage applications, predicting performance as a function of the number of parallel processes is complex. This research proposes FiM, a performance approximation approach that predicts the calculation time (with FiM-Cal) and the communication time (with FiM-Com) of an application running on a distributed framework. FiM-Cal consists of two coupled components: (1) a stochastic Markov model that captures non-deterministic runtime, which often depends on parallel resources such as the number of processes, and (2) a machine-learning model that extrapolates the parameters used to calibrate the Markov model when application parameters, such as the dataset, change. In addition to parallel calculation time, parallel computing platforms spend time transferring data among nodes. FiM-Com uses a simulation-based queuing model to quickly estimate this communication time. Our modeling approach considers design choices along multiple dimensions, namely (i) process-level parallelism, (ii) distribution of cores on a multi-processor platform, (iii) application-related parameters, and (iv) characteristics of datasets. The major contribution of our prediction approach is that FiM can accurately predict parallel processing time for datasets that are much larger than the training datasets. We evaluate our approach with the NAS Parallel Benchmarks and real iterative data processing applications. We compare the predicted results (e.g., end-to-end execution time) with experimental measurements on a real distributed platform. We also compare our work with an existing prediction technique based on machine learning.
We rank the candidate numbers of processes by actual execution results and by the results FiM predicts, and then calculate the correlation between the two rankings. Our results show that FiM obtains a high correlation, in the range of 0.80 to 0.99, which indicates considerable accuracy. Such predictions give data analysts useful insight into the optimal configuration of parallel resources (e.g., number of processes and number of cores) and help system designers investigate the impact of changes in application parameters on system performance.
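
The ranking comparison described above can be illustrated with a short sketch. It assumes Spearman's rank correlation, which the abstract does not name, so the choice of coefficient is an assumption. The process counts and timings are hypothetical values made up for illustration; they are not results from the paper. The helper names `ranks`, `pearson`, and `rank_correlation` are likewise illustrative.

```python
def ranks(values):
    """Return 1-based ranks (smallest value gets rank 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of entries tied with the entry at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(actual, predicted):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(actual), ranks(predicted))


# Hypothetical execution times (seconds) for each candidate process count.
process_counts = [1, 2, 4, 8, 16]
actual_seconds = [100.0, 55.0, 32.0, 25.0, 28.0]
predicted_seconds = [98.0, 57.0, 30.0, 27.0, 26.0]

rho = rank_correlation(actual_seconds, predicted_seconds)
print(f"rank correlation over {len(process_counts)} configurations: {rho:.2f}")
```

In this toy data the two rankings disagree only on which of the 8-process and 16-process configurations is fastest, so the correlation stays high even though the predicted best configuration differs from the measured one.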

Funder

AFOSR

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications, Modeling and Simulation

Cited by 13 articles.

1. Automatic Multi-Parameter Performance Modeling of HPC Applications on a New Sunway Supercomputer;IEEE Transactions on Parallel and Distributed Systems;2023-11

2. Context-aware Big Data Quality Assessment: A Scoping Review;Journal of Data and Information Quality;2023-08-22

3. A Performance Prediction Model for Structured Grid Based Applications in HPC Environments;2023 22nd International Symposium on Parallel and Distributed Computing (ISPDC);2023-07

4. Scheduling of Elastic Message Passing Applications on HPC Systems;Job Scheduling Strategies for Parallel Processing;2023

5. Theory of universal approach to improve predictive models using parallel data and application examples;PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON FRONTIER OF DIGITAL TECHNOLOGY TOWARDS A SUSTAINABLE SOCIETY;2023
