Multilevel Phase Analysis

Author:

Zhang Weihua1,Li Jiaxin2,Li Yi2,Chen Haibo3

Affiliation:

1. Shanghai Key Laboratory of Data Science, Software School, and State Key Laboratory of ASIC & System, Fudan University, Shanghai, China

2. Parallel Processing Institute, Fudan University, Shanghai, China

3. School of Software, Shanghai Jiao Tong University, Shanghai, China

Abstract

Phase analysis, which classifies the set of execution intervals with similar execution behavior and resource requirements, has been widely used in a variety of systems, including dynamic cache reconfiguration, prefetching, race detection, and sampling simulation. Although phase granularity has been a major factor in the accuracy of phase analysis, it has not been well investigated, and most systems usually adopt a fine-grained scheme. However, such a scheme can only take account of recent local phase information and could be frequently interfered by temporary noise due to instant phase changes, which might notably limit the accuracy. In this article, we make the first investigation on the potential of multilevel phase analysis (MLPA), where different granularity phase analyses are combined together to improve the overall accuracy. The key observation is that the coarse-grained intervals belonging to the same phase usually consist of stably distributed fine-grained phases. Moreover, the phase of a coarse-grained interval can be accurately identified based on the fine-grained intervals at the beginning of its execution. Based on the observation, we design and implement an MLPA scheme. In such a scheme, a coarse-grained phase is first identified based on the fine-grained intervals at the beginning of its execution. The following fine-grained phases in it are then predicted based on the sequence of fine-grained phases in the coarse-grained phase. Experimental results show that such a scheme can notably improve the prediction accuracy. Using a Markov fine-grained phase predictor as the baseline, MLPA can improve prediction accuracy by 20%, 39%, and 29% for next phase, phase change, and phase length prediction for SPEC2000, respectively, yet incur only about 2% time overhead and 40% space overhead (about 360 bytes in total). To demonstrate the effectiveness of MLPA, we apply it to a dynamic cache reconfiguration system that dynamically adjusts the cache size to reduce the power consumption and access time of the data cache. Experimental results show that MLPA can further reduce the average cache size by 15% compared to the fine-grained scheme. Moreover, for MLPA, we also observe that coarse-grained phases can better capture the overall program characteristics with fewer of phases and the last representative phase could be classified in a very early program position, leading to fewer execution internals being functionally simulated. Based on this observation, we also design a multilevel sampling simulation technique that combines both fine- and coarse-grained phase analysis for sampling simulation. Such a scheme uses fine-grained simulation points to represent only the selected coarse-grained simulation points instead of the entire program execution; thus, it could further reduce both the functional and detailed simulation time. Experimental results show that MLPA for sampling simulation can achieve a speedup in simulation time of about 8.3X with similar accuracy compared to 10M SimPoint.

Funder

National 863 Program of China

China National Natural Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Software

Cited by 10 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Hardware Prefetching Tuning Method Based on Program Phase Behavior;Journal of Circuits, Systems and Computers;2023-11-24

2. SCFM: A Statistical Coarse-to-Fine Method to Select Cross-Microarchitecture Reliable Simulation Points;Lecture Notes in Computer Science;2023-11-08

3. Learning-based Phase-aware Multi-core CPU Workload Forecasting;ACM Transactions on Design Automation of Electronic Systems;2022-12-24

4. Pop-Crypt: Identification and Management of Popular Words for Enhancing Lifetime of EnCrypted Nonvolatile Main Memories;IEEE Transactions on Very Large Scale Integration (VLSI) Systems;2022-09

5. ML for System-Level Modeling;Machine Learning Applications in Electronic Design Automation;2022

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3