Affiliation:
1. Tsinghua University, Beijing, China
Abstract
For designers of large-scale parallel computers, it is greatly desired that performance of parallel applications can be predicted at the design phase. However, this is difficult because the execution time of parallel applications is determined by several factors, including sequential computation time in each process, communication time and their convolution. Despite previous efforts, it remains an open problem to estimate sequential computation time in each process accurately and efficiently for large-scale parallel applications on non-existing target machines.
This paper proposes a novel approach to predict the sequential computation time accurately and efficiently. We assume that there is at least one node of the target platform but the whole target system need not be available. We make two main technical contributions. First, we employ deterministic replay techniques to execute any process of a parallel application on a single node at real speed. As a result, we can simply measure the real sequential computation time on a target node for each process one by one. Second, we observe that computation behavior of processes in parallel applications can be clustered into a few groups while processes in each group have similar computation behavior. This observation helps us reduce measurement time significantly because we only need to execute representative parallel processes instead of all of them.
We have implemented a performance prediction framework, called PHANTOM, which integrates the above computation-time acquisition approach with a trace-driven network simulator. We validate our approach on several platforms. For ASCI Sweep3D, the error of our approach is less than 5% on 1024 processor cores. Compared to a recent regression-based prediction approach, PHANTOM presents better prediction accuracy across different platforms.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design,Software
Cited by
62 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. SimMSG: Simulating Transportation of MPI Messages in High Performance Computing Systems;2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys);2023-12-17
2. Optimizing Logging and Monitoring in Heterogeneous Cloud Environments for IoT and Edge Applications;IEEE Internet of Things Journal;2023-12-15
3. Leveraging simulation of high performance computing systems with node simulation using architecture simulator;CCF Transactions on High Performance Computing;2023-11-13
4. Dependency-Driven Interconnection Network Simulation Using MPI Traces;2023 International Conference on Ubiquitous Communication (Ucom);2023-07-07
5. A particle-based parallel scheme for material point method (MPM) using message passing interface (MPI);Computational Particle Mechanics;2022-05-17