Affiliation:
1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
2. NVIDIA Corporation, Sunnyvale, CA, USA
Abstract
Process, voltage, and temperature (PVT) variations can significantly degrade the performance benefits expected from next-generation nanoscale technology. The primary circuit-level implication of PVT variations is the timing emergencies they cause. In a multi-core processor running multiple programs, variations create spatial and temporal imbalance across the processing cores. Most prior schemes are dedicated to tolerating PVT variations individually for a single core, but ignore the opportunity to leverage the complementary effects between variations and the intrinsic variation imbalance among individual cores. We find that the delay impacts of different variations do not necessarily accumulate. Cores with mild variations can share the demanding workload of cores suffering large variations. If managed correctly, variations on different cores can help mitigate each other and result in a variation-mild environment. In this paper, we propose Timing Emergency Aware Thread Migration (TEA-TM), a delay-sensor-based scheme to reduce system timing emergencies under PVT variations. Fourier transform and frequency-domain analysis are conducted to provide insight into the PVT co-optimization scheme and its potential. Experimental results show that, on average, TEA-TM can save up to 24% of the throughput loss while also improving system fairness by 85%.
Publisher
Association for Computing Machinery (ACM)
Cited by
10 articles.
1. Fault-Tolerant General Purposed Processors;Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design;2023
2. A Survey of Architectural Techniques for Managing Process Variation;ACM Computing Surveys;2016-05-02
3. CoreRank: Redeeming “Sick Silicon” by Dynamically Quantifying Core-Level Healthy Condition;IEEE Transactions on Computers;2016-03-01
4. Orchestrator: Guarding Against Voltage Emergencies in Multithreaded Applications;IEEE Transactions on Very Large Scale Integration (VLSI) Systems;2014-12
5. Globally precise-restartable execution of parallel programs;Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation;2014-06-09