ITAP

Author:

Sadrosadati Mohammad (1), Ehsani Seyed Borna (1), Falahati Hajar (2), Ausavarungnirun Rachata (3), Tavakkol Arash (4), Abaee Mojtaba (2), Orosa Lois (5), Wang Yaohua (6), Sarbazi-Azad Hamid (7), Mutlu Onur (8)

Affiliation:

1. Sharif University of Technology, Tehran, Iran

2. IPM

3. Carnegie Mellon University, KMUTNB

4. ETH Zürich

5. University of Campinas, ETH Zürich

6. ETH Zürich, National University of Defense Technology

7. Sharif University of Technology, IPM

8. ETH Zürich, Carnegie Mellon University

Abstract

Graphics Processing Units (GPUs) are widely used as the accelerator of choice for applications with massively data-parallel tasks. However, recent studies show that GPUs suffer heavily from resource underutilization, which, combined with their large static power consumption, imposes a significant power overhead. The execution units, among the most power-hungry components of a GPU, are frequently idle when (1) an underutilized warp is issued to the execution units, leading to partial lane idleness, and (2) no active warp is available to issue because of warp stalls (e.g., waiting for memory accesses or synchronization). Although large in total, the idle time of execution units comes from short but frequent stalls, leaving little potential for common power-saving techniques such as power-gating. In this article, we propose ITAP, a novel idle-time-aware power management technique that aims to effectively reduce the static energy consumption of GPU execution units. By taking advantage of different power management techniques (i.e., power-gating and different levels of voltage scaling), ITAP employs three static power reduction modes with different overheads and static power reduction capabilities. ITAP estimates the idle period length of execution units by using prediction and peek-ahead techniques synergistically, and then applies the most appropriate static power reduction mode for the estimated idle period length. We design ITAP to be either power-aggressive or performance-aggressive, but not both at the same time. Our experimental results on several workloads show that the power-aggressive design of ITAP outperforms the state-of-the-art solution by an average of 27.6% in terms of static energy savings, with less than 2.1% performance overhead. The performance-aggressive design of ITAP, in turn, improves static energy savings by an average of 16.9% compared to the state-of-the-art static energy savings mechanism, while keeping GPU performance almost unaffected (i.e., up to 0.4% performance overhead).
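The mode-selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the three modes are two levels of voltage scaling plus power-gating, and that each mode pays off only when the idle period exceeds some break-even length. The mode names, the `BREAK_EVEN_CYCLES` thresholds, and the `select_mode` function are hypothetical and are not taken from the article; the estimated idle length is treated as an input, since the abstract does not specify how prediction and peek-ahead are combined.

```python
from enum import Enum


class Mode(Enum):
    """Power states of an execution unit (names are illustrative)."""
    ACTIVE = "active"
    LIGHT_VOLTAGE_SCALING = "light_voltage_scaling"
    DEEP_VOLTAGE_SCALING = "deep_voltage_scaling"
    POWER_GATING = "power_gating"


# Hypothetical break-even idle lengths, in cycles, ordered from the mode
# with the highest overhead (and highest savings) to the lowest.
# These numbers are placeholders, not values reported in the article.
BREAK_EVEN_CYCLES = [
    (Mode.POWER_GATING, 20),
    (Mode.DEEP_VOLTAGE_SCALING, 8),
    (Mode.LIGHT_VOLTAGE_SCALING, 2),
]


def select_mode(estimated_idle_cycles: int) -> Mode:
    """Pick the deepest mode whose break-even length the estimate meets."""
    for mode, min_cycles in BREAK_EVEN_CYCLES:
        if estimated_idle_cycles >= min_cycles:
            return mode
    # Idle period too short for any mode to pay off: stay fully powered.
    return Mode.ACTIVE


if __name__ == "__main__":
    for cycles in (0, 2, 10, 25):
        print(cycles, select_mode(cycles).value)
```

Under these assumptions, a short estimated idle period maps to a low-overhead mode and only long periods trigger power-gating, which matches the abstract's observation that idle time consists mostly of short, frequent stalls.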

Funder

FAPESP

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Cited by 20 articles.

1. Cross-core Data Sharing for Energy-efficient GPUs;ACM Transactions on Architecture and Code Optimization;2024-09-14

2. Dynamic Thermal Management of 3D Memory through Rotating Low Power States and Partial Channel Closure;ACM Transactions on Embedded Computing Systems;2023-11-09

3. Snake: A Variable-length Chain-based Prefetching for GPUs;56th Annual IEEE/ACM International Symposium on Microarchitecture;2023-10-28

4. A Reinforcement Learning Approach for Performance-aware Reduction in Power Consumption of Data Center Compute Nodes;2023 IEEE International Conference on Cloud Engineering (IC2E);2023-09-25

5. TREFU: An Online Error Detecting and Correcting Fault Tolerant GPGPU Architecture;2023 IEEE 29th International Symposium on On-Line Testing and Robust System Design (IOLTS);2023-07-03
