Affiliation:
1. East China Normal University, China
Abstract
Collecting sufficient microarchitecture performance data is essential for performance evaluation and workload characterization. A modern processor exposes many events to monitor but provides only a few hardware performance monitoring counters (PMCs), so multiplexing is commonly adopted. However, state-of-the-art profiling tools often group events inefficiently when multiplexing PMCs, which risks inaccurate measurement and misleading analysis. Commercial tools can leverage PMCs, but they are closed source and support only their specified platforms. To this end, we propose an approach for efficient cross-platform microarchitecture performance measurement via adaptive grouping, aiming to improve the metrics’ sampling ratios. The approach generates event groups based on the number of available PMCs detected on an arbitrary machine while avoiding the scheduling pitfall of the Linux perf_event subsystem. We evaluate our approach with SPEC CPU 2017 on four mainstream x86-64 and AArch64 processors and compare its efficiency with that of two other state-of-the-art tools, LIKWID and ARM Top-down Tool. The experimental results indicate that our approach improves the average sampling ratio of metrics by around 50% without compromising correctness or reliability.
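The abstract does not spell out the grouping algorithm, so the following is only a minimal sketch of the general idea of packing events into groups sized by the available PMC count. It is not the authors' method. The function names, metric names, and event names are all hypothetical, and the sketch assumes that groups are multiplexed round-robin with equal time slices, so a metric whose events share one group is sampled for roughly 1 / (number of groups) of the run. It ignores fixed-function counters, per-event counter constraints, and the perf_event scheduling pitfall mentioned above.

```python
from typing import Dict, List, Set


def group_events(metrics: Dict[str, Set[str]], num_pmcs: int) -> List[Set[str]]:
    """Pack each metric's events into a group holding at most num_pmcs events.

    Hypothetical first-fit-decreasing heuristic: a metric's events always stay
    together in one group, and metrics sharing events tend to share a group.
    """
    groups: List[Set[str]] = []
    # Place metrics that need the most events first; break ties by name.
    ordered = sorted(metrics.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, events in ordered:
        if len(events) > num_pmcs:
            raise ValueError(f"metric {name!r} needs more events than PMCs")
        for group in groups:
            # Reuse a group if the merged event set still fits on the PMCs.
            if len(group | events) <= num_pmcs:
                group |= events
                break
        else:
            groups.append(set(events))
    return groups


def sampling_ratio(groups: List[Set[str]]) -> float:
    """Fraction of run time each group is counted under equal round-robin slices."""
    return 1.0 / len(groups) if groups else 1.0


# Hypothetical metrics and event names, for illustration only.
metrics = {
    "ipc": {"cycles", "instructions"},
    "cpi": {"cycles", "instructions"},
    "branch_mpki": {"branch_miss", "instructions"},
    "l1d_miss_rate": {"l1d_miss", "l1d_access"},
}

naive_groups = [set(events) for events in metrics.values()]  # one group per metric
packed_groups = group_events(metrics, num_pmcs=4)

print(len(naive_groups), sampling_ratio(naive_groups))
print(len(packed_groups), sampling_ratio(packed_groups))
```

Under these assumptions, one group per metric yields four groups, while packing by the detected PMC count yields fewer groups and therefore a higher sampling ratio per metric. A machine with more PMCs would allow even fewer groups, which is why the grouping needs to adapt to the detected counter count.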
Funder
National Natural Science Foundation of China
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture,Information Systems,Software