Compiler optimization of scalar value communication between speculative threads

Published:2002-10 Issue:10 Volume:37 Page:171-183
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Zhai Antonia¹,Colohan Christopher B.¹,Steffan J. Gregory¹,Mowry Todd C.¹

Affiliation:

1. Carnegie Mellon University, Pittsburgh, PA

Abstract

While there have been many recent proposals for hardware that supports Thread-Level Speculation (TLS), there has been relatively little work on compiler optimizations to fully exploit this potential for parallelizing programs optimistically. In this paper, we focus on one important limitation of program performance under TLS, which is stalls due to forwarding scalar values between threads that would otherwise cause frequent data dependences. We present and evaluate dataflow algorithms for three increasingly-aggressive instruction scheduling techniques that reduce the critical forwarding path introduced by the synchronization associated with this data forwarding. In addition, we contrast our compiler techniques with related hardware-only approaches. With our most aggressive compiler and hardware techniques, we improve performance under TLS by 6.2-28.5% for 6 of 14 applications, and by at least 2.7% for half of the other applications.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design,Software

Link

https://dl.acm.org/doi/pdf/10.1145/605432.605416

References: 37 articles.

1. Improving data-flow analysis with path profiles

2. BROADCOM CORPORATION. The Sibyte SB-1250 Processor. http://www.sibyte.com/mercurian.

Cited by 7 articles.

1. Performance Estimation of Task Graphs Based on Path Profiling;International Journal of Parallel Programming;2015-07-23

2. Dynamic Core Allocation for Energy-Efficient Thread-Level Speculation;2014 IEEE 17th International Conference on Computational Science and Engineering;2014-12

3. A Dynamically Adaptive Approach for Speculative Loop Execution in SMT Architectures;2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS);2014-08

4. The design and implementation of heterogeneous multicore systems for energy-efficient speculative thread execution;ACM Transactions on Architecture and Code Optimization;2013-12

5. Disjoint out-of-order execution processor;ACM Transactions on Architecture and Code Optimization;2012-09