SynchroTrace

Author:

Sangaiah Karthik (1), Lui Michael (1), Jagtap Radhika (2), Diestelhorst Stephan (2), Nilakantan Siddharth (3), More Ankit (4), Taskin Baris (1), Hempstead Mark (5)

Affiliation:

1. Drexel University, Philadelphia, PA

2. ARM Ltd., Cambridge, UK

3. NVIDIA Corporation

4. Intel Corporation

5. Tufts University, Medford, MA

Abstract

Trace-driven simulation of chip multiprocessor (CMP) systems offers many advantages over execution-driven simulation, such as reduced simulation time and complexity, portability, and scalability. However, trace-based simulation approaches have difficulty capturing and accurately replaying multithreaded traces due to the inherent nondeterminism in the execution of multithreaded programs. In this work, we present SynchroTrace, a scalable, flexible, and accurate trace-based multithreaded simulation methodology. By recording synchronization events relevant to modern threading libraries (e.g., Pthreads and OpenMP) and dependencies in the traces, independent of the host architecture, the methodology is able to accurately model the nondeterminism of multithreaded programs for different hardware platforms and threading paradigms. By capturing high-level instruction categories, the SynchroTrace average-CPI trace replay timing model offers fast and accurate simulation of many-core in-order CMPs. We perform two case studies to validate the SynchroTrace simulation flow against the gem5 full-system simulator: (1) a constraint-based design space exploration with traditional CMP benchmarks and (2) a thread-scalability study with HPC-representative applications. The results from these case studies show that (1) our trace-based approach with trace filtering has a peak speedup of 18.7× over simulation in gem5 full-system, with an average speedup of 9.6×, (2) SynchroTrace maintains the thread-scaling accuracy of gem5 and can efficiently scale up to 64 threads, and (3) SynchroTrace can trace on one platform and model any platform in the early stages of design.
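
To make the idea of synchronization-aware replay concrete, the following is a minimal Python sketch and not SynchroTrace's actual trace format or implementation. It assumes a hypothetical per-thread trace of `("comp", n)`, `("lock", id)`, and `("unlock", id)` events and a single average CPI for all instructions, whereas the abstract describes high-level instruction categories. Because lock events are stored in the trace rather than fixed timestamps, the order in which threads acquire a lock is decided at replay time by the modeled timing.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    events: list        # hypothetical trace: ("comp", n) | ("lock", id) | ("unlock", id)
    pc: int = 0         # index of the next event to replay
    clock: float = 0.0  # local time of this thread, in cycles


def replay(traces, cpi=1.0):
    """Replay per-thread traces and return each thread's finish time in cycles."""
    threads = [ThreadState(list(t)) for t in traces]
    owner = {}    # lock id -> index of the thread currently holding it
    free_at = {}  # lock id -> cycle at which the lock was last released

    def blocked(i):
        # A thread is blocked if its next event is a lock held by another thread.
        kind, arg = threads[i].events[threads[i].pc]
        return kind == "lock" and arg in owner

    while True:
        pending = [i for i, t in enumerate(threads) if t.pc < len(t.events)]
        if not pending:
            break
        runnable = [i for i in pending if not blocked(i)]
        if not runnable:
            raise RuntimeError("deadlock: every pending thread waits on a lock")

        # Advance the runnable thread that is furthest behind in time.
        i = min(runnable, key=lambda k: threads[k].clock)
        t = threads[i]
        kind, arg = t.events[t.pc]
        if kind == "comp":
            t.clock += arg * cpi  # average-CPI timing for a block of instructions
        elif kind == "lock":
            t.clock = max(t.clock, free_at.get(arg, 0.0))  # wait for the release
            owner[arg] = i
        elif kind == "unlock":
            del owner[arg]
            free_at[arg] = t.clock
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
        t.pc += 1

    return [t.clock for t in threads]


traces = [
    [("comp", 100), ("lock", "m"), ("comp", 50), ("unlock", "m")],
    [("comp", 10), ("lock", "m"), ("comp", 200), ("unlock", "m")],
]
print(replay(traces, cpi=1.0))
```

In this example the second thread reaches the lock first, so the first thread waits until the lock is released before running its critical section. Changing `cpi` rescales the timing without re-tracing, which loosely mirrors the trace-once, model-many-platforms idea in the abstract.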

Funder

National Science Foundation, including CAREER

NSF Graduate Research Fellowship

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Reference: 38 articles.

1. C. Bienia. 2011. Benchmarking Modern Multiprocessors. Ph.D. Dissertation. Princeton University, Princeton, NJ.

2. The gem5 simulator

Cited by 9 articles.

1. Āpta: Fault-tolerant object-granular CXL disaggregated memory for accelerating FaaS;2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN);2023-06

2. Distributed Effect Evaluation Algorithm of Computer English Online Platform based on Hibernate Task-based Data Architecture;2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS);2022-05-25

3. Parallel I/O Evaluation Techniques and Emerging HPC Workloads: A Perspective;2021 IEEE International Conference on Cluster Computing (CLUSTER);2021-09

4. Dvé: Improving DRAM Reliability and Performance On-Demand via Coherent Replication;2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA);2021-06

5. Negative Perceptions About the Applicability of Source-to-Source Compilers in HPC: A Literature Review;Lecture Notes in Computer Science;2021
