Affiliation:
1. Ghent University, Belgium
Abstract
Analyzing multi-threaded programs is quite challenging, but is necessary to obtain good multicore performance while saving energy. Due to synchronization, certain threads make others wait, because they hold a lock or have yet to reach a barrier. We call these critical threads, i.e., threads whose performance is determinative of program performance as a whole. Identifying these threads can reveal numerous optimization opportunities, for the software developer and for hardware.
In this paper, we propose a new metric for assessing thread criticality, which combines both how much time a thread is performing useful work and how many co-running threads are waiting. We show how thread criticality can be calculated online with modest hardware additions and with low overhead. We use our metric to create criticality stacks that break total execution time into each thread's criticality component, allowing for easy visual analysis of parallel imbalance.
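As an illustration only, the idea of a criticality stack can be sketched as follows. The abstract does not give the formula, so the weighting used here is an assumption: execution is split into intervals with a constant set of active (non-waiting) threads, and each interval's duration is shared equally among the threads active in it. The function name `criticality_stack` and the interval data are hypothetical.

```python
from collections import defaultdict


def criticality_stack(intervals):
    """Return a per-thread criticality value.

    `intervals` is a list of (duration, active_thread_ids) pairs, where
    `active_thread_ids` is the set of threads doing useful work (not
    waiting on a lock or barrier) during that interval.

    Assumption (not stated in the abstract): each interval's duration is
    divided equally among its active threads, so a thread running while
    many others wait accumulates more criticality.
    """
    criticality = defaultdict(float)
    for duration, active in intervals:
        if not active:
            continue  # nothing running; no thread is charged
        share = duration / len(active)
        for tid in active:
            criticality[tid] += share
    return dict(criticality)


# Hypothetical trace: four threads run together, then two threads reach
# a barrier early, and finally thread 0 runs alone while the rest wait.
intervals = [
    (4.0, {0, 1, 2, 3}),
    (2.0, {0, 1}),
    (2.0, {0}),
]

stack = criticality_stack(intervals)
total_time = sum(duration for duration, _ in intervals)
most_critical = max(stack, key=stack.get)
```

Under this assumption the per-thread components sum to the total execution time, which is what lets them be drawn as a stacked bar, and the thread with the largest component (thread 0 in this trace) is the natural candidate for acceleration.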
To validate our criticality metric, and demonstrate it is better than previous metrics, we scale the frequency of the most critical thread and show it achieves the largest performance improvement. We then demonstrate the broad applicability of criticality stacks by using them to perform three types of optimizations: (1) program analysis to remove parallel bottlenecks, (2) dynamically identifying the most critical thread and accelerating it using frequency scaling to improve performance, and (3) showing that accelerating only the most critical thread allows for targeted energy reduction.
Funder
Fonds Wetenschappelijk Onderzoek (FWO)
European Research Council
Universiteit Gent
Seventh Framework Programme
Publisher
Association for Computing Machinery (ACM)
Cited by
4 articles.
1. Advance Virtual Channel Reservation;IEEE Transactions on Computers;2020-09-01
2. Representative paths analysis;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2017-11-12
3. RMC;Proceedings of the 13th International Conference on Embedded Software;2016-10
4. Synergistic timing speculation for multi-threaded programs;Proceedings of the 53rd Annual Design Automation Conference;2016-06-05