Dynamic performance tuning for speculative threads

Published: 2009-06-15 Issue: 3 Volume: 37 Page: 462-473
ISSN: 0163-5964
Container-title: ACM SIGARCH Computer Architecture News
Language: en
Short-container-title: SIGARCH Comput. Archit. News

Author:

Luo Yangchun¹,Packirisamy Venkatesan¹,Hsu Wei-Chung¹,Zhai Antonia¹,Mungre Nikhil¹,Tarkas Ankit¹

Affiliation:

1. University of Minnesota - Twin Cities, Minneapolis, MN, USA

Abstract

In response to the emergence of multicore processors, various novel and sophisticated execution models have been introduced to fully utilize these processors. One such execution model is Thread-Level Speculation (TLS), which allows potentially dependent threads to execute speculatively in parallel. While TLS offers significant performance potential for applications that are otherwise non-parallel, extracting efficient speculative threads in the presence of complex control flow and ambiguous data dependences is a real challenge. This task is further complicated by the fact that the performance of speculative threads is often architecture-dependent and input-sensitive, and exhibits phase behavior. We therefore propose dynamic performance tuning mechanisms that determine where and how to create speculative threads at runtime. This paper describes the design, implementation, and evaluation of hardware and software support that takes advantage of runtime performance profiles to extract efficient speculative threads. In the proposed framework, speculative threads are monitored by hardware-based performance counters, their performance impact is estimated, and the creation of speculative threads is adjusted based on that estimate. This paper proposes performance estimation techniques for speculative threads that correctly determine whether speculation can improve performance for loops that correspond to 83.8% of total loop execution time across all benchmarks. This paper also examines several dynamic performance tuning policies and finds that the best tuning policy achieves an overall speedup of 36.8% on a set of benchmarks from the SPEC2000 suite, outperforming static thread management by 9.5%.
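The monitor, estimate, and adjust loop described in the abstract can be illustrated with a minimal Python sketch. This is not the estimator or tuning policy from the paper, whose details are not given here. It assumes a hypothetical `LoopProfile` record holding per-loop cycle and iteration counts sampled in both sequential and speculative mode, and a simple threshold rule, `should_speculate`, that keeps speculation enabled only when the estimated speedup exceeds 1.0. All names, fields, and the threshold are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class LoopProfile:
    """Hypothetical per-loop counter readings (field names are assumptions)."""
    seq_cycles: int = 0   # cycles observed while the loop ran sequentially
    seq_iters: int = 0    # iterations observed in sequential mode
    spec_cycles: int = 0  # cycles observed while the loop ran speculatively
    spec_iters: int = 0   # iterations observed in speculative mode

    def estimated_speedup(self):
        """Ratio of sequential to speculative cycles per iteration.

        Returns None when either mode has not been sampled yet.
        """
        if self.seq_iters == 0 or self.spec_iters == 0 or self.spec_cycles == 0:
            return None
        seq_per_iter = self.seq_cycles / self.seq_iters
        spec_per_iter = self.spec_cycles / self.spec_iters
        return seq_per_iter / spec_per_iter


def should_speculate(profile, threshold=1.0):
    """Assumed policy: speculate only if the estimated speedup beats the threshold."""
    estimate = profile.estimated_speedup()
    if estimate is None:
        return True  # no evidence yet, so keep sampling speculative execution
    return estimate > threshold


# Usage: one loop that benefits from speculation and one that does not.
helpful = LoopProfile(seq_cycles=1000, seq_iters=10, spec_cycles=600, spec_iters=10)
harmful = LoopProfile(seq_cycles=1000, seq_iters=10, spec_cycles=1500, spec_iters=10)
print(should_speculate(helpful), should_speculate(harmful))
```

A runtime system built along these lines would re-evaluate the decision periodically, since the abstract notes that speculative-thread performance varies with input and program phase.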

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/1555815.1555812

References: 49 articles.

1. The SimpleScalar tool set, version 2.0

Cited by 9 articles.

1. DynaSprint;Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture;2019-10-12

2. Dynamic Core Allocation for Energy-Efficient Thread-Level Speculation;2014 IEEE 17th International Conference on Computational Science and Engineering;2014-12

3. A Static Greedy and Dynamic Adaptive Thread Spawning Approach for Loop-Level Parallelism;Journal of Computer Science and Technology;2014-11

4. A Dynamically Adaptive Approach for Speculative Loop Execution in SMT Architectures;2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS);2014-08

5. The BonaFide C Analyzer: automatic loop-level characterization and coverage measurement;The Journal of Supercomputing;2014-01-21