Affiliation:
1. Washington State University, Pullman, WA
2. Intel Research Labs, Hillsboro, OR
Abstract
The complexity of manycore systems-on-chip (SoCs) is growing faster than our ability to manage them to reduce overall energy consumption. Further, as SoC design moves toward three-dimensional (3D) architectures, the power density of the cores increases, leading to unacceptably high peak chip temperatures. In this article, we consider the optimization problem of dynamic power management (DPM) in manycore SoCs for an allowable performance penalty (say, 5%) and an admissible peak chip temperature. We employ a machine learning (ML)-based DPM policy, which selects the voltage/frequency levels for different clusters of cores as a function of application workload features such as core computation and inter-core traffic. We propose a novel learning-to-search (L2S) framework to automatically identify an optimized sequence of DPM decisions from a large combinatorial space for joint energy-thermal optimization for one or more given applications. The optimized DPM decisions are given to a supervised learning algorithm to train a DPM policy, which mimics the corresponding decision-making behavior. Our experiments on two different manycore architectures designed using wireless interconnect and monolithic 3D demonstrate that the principles behind the L2S framework are applicable to more than one configuration. Moreover, L2S-based DPM policies achieve up to 30% energy-delay product savings and reduce the peak chip temperature by up to 17 °C compared to state-of-the-art ML methods for an allowable performance overhead of only 5%.
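The constrained optimization described in the abstract can be illustrated with a minimal sketch. It is not the L2S framework: it enumerates every voltage/frequency assignment by brute force, which is only feasible for a toy space, whereas the article targets a large combinatorial space. The frequency levels, the power and temperature models, and the names `VF_LEVELS`, `evaluate`, and `best_decision` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical frequency levels (GHz) available to every cluster.
VF_LEVELS = [1.0, 1.5, 2.0, 2.5]


def evaluate(freqs, work):
    """Toy model (an assumption, not the article's simulator).

    Returns (execution_time, energy, peak_temperature).
    """
    # Each cluster needs work / f time units; the slowest cluster sets the total.
    time = max(w / f for w, f in zip(work, freqs))
    # Assume dynamic power grows with f^3 (voltage scaled with frequency).
    power = [f ** 3 for f in freqs]
    energy = sum(power) * time
    # Assume peak temperature rises linearly with the hottest cluster's power.
    peak_temp = 45.0 + 2.0 * max(power)
    return time, energy, peak_temp


def best_decision(work, max_penalty=0.05, max_temp=85.0):
    """Return (edp, freqs) with the lowest energy-delay product, or None."""
    # Baseline: every cluster runs at the highest level.
    base_time, _, _ = evaluate([max(VF_LEVELS)] * len(work), work)
    best = None
    for freqs in product(VF_LEVELS, repeat=len(work)):
        time, energy, temp = evaluate(freqs, work)
        # Reject decisions that violate the performance or thermal limit.
        if time > base_time * (1 + max_penalty) or temp > max_temp:
            continue
        edp = energy * time
        if best is None or edp < best[0]:
            best = (edp, freqs)
    return best


if __name__ == "__main__":
    print(best_decision([4.0, 2.0, 1.0]))
```

For the workload `[4.0, 2.0, 1.0]`, this toy model keeps the busiest cluster at the top level and slows the other two, since lowering their frequency saves energy without lengthening execution time. In the workflow the abstract describes, decisions of this kind would then serve as training labels for a supervised DPM policy.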
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications