TAO: Re-Thinking DL-based Microarchitecture Simulation

Published: 2024-06-11
Issue: 1
Volume: 52
Pages: 23-24
ISSN: 0163-5999
Container title: ACM SIGMETRICS Performance Evaluation Review
Language: en
Short container title: SIGMETRICS Perform. Eval. Rev.

Author:

Santosh Pandey¹, Amir Yazdanbakhsh², Hang Liu¹

Affiliation:

1. Rutgers University, New Brunswick, NJ, USA

2. Google DeepMind, Mountain View, CA, USA

Abstract

Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements. While the quest for fast, accurate, and detailed microarchitecture simulation has been ongoing for decades, existing simulators excel and fall short in different aspects: (i) Although execution-driven simulation is accurate and detailed, it is extremely slow and requires expert-level experience to design. (ii) Trace-driven simulation reuses execution traces in pursuit of fast simulation but faces accuracy concerns and fails to achieve significant speedup. (iii) Emerging deep learning (DL)-based simulations are remarkably fast and have acceptable accuracy, but introduce substantial overheads from trace regeneration and model re-training when simulating a new microarchitecture. Re-thinking the advantages and limitations of these three mainstream simulation paradigms, this paper introduces TAO, which redesigns DL-based simulation with three primary contributions. First, we propose a new training dataset design such that the subsequent simulation (i.e., inference) only needs functional traces as input, which can be rapidly generated and reused across microarchitectures. Second, to increase the detail of the simulation, we redesign the input features and the DL model using self-attention to support predicting various performance metrics of interest. Third, we propose techniques to train a microarchitecture-agnostic embedding layer that enables fast transfer learning between different microarchitectural configurations and effectively reduces the re-training overhead of conventional DL-based simulators. TAO can predict various performance metrics of interest, significantly reduce simulation time, and maintain simulation accuracy similar to that of state-of-the-art DL-based endeavors.

Funder

NSF

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3673660.3655085
