Affiliation:
1. North Carolina State University
Abstract
Processor soft-error rates are projected to increase as feature sizes scale down, necessitating the adoption of reliability-enhancing techniques, but the power and performance overheads of such techniques remain a concern. Dynamic cross-layer techniques are a promising way to improve the cost-effectiveness of resilient systems. As a foundation for building such a system, we propose a cross-layer approach for estimating the architectural vulnerability factor (AVF) of a processor core online. The hardware layer combines metadata from the software and compiler layers with microarchitectural measurements to produce the estimate at runtime. We describe our design and evaluate it in detail on a set of SPEC CPU 2006 applications. We find that our online AVF estimate is highly accurate with respect to a postmortem AVF analysis, with only 0.46% average absolute error. Our design incurs negligible performance impact for SPEC CPU 2006 applications and about 1.2% for a Monte Carlo application, requires approximately 1.4% area overhead, and costs about 3.3% more power on average. We compare our technique against two prior online AVF estimation techniques, one using linear regression to estimate AVF and another based on PVF-HVF; our evaluation finds that our approach is, on average, more accurate. Our case study of a Monte Carlo simulation shows that our AVF estimate can adapt to the inherent resiliency of the algorithm. Finally, we demonstrate the effectiveness of our approach using a dynamic protection scheme that limits vulnerability to soft errors: with a target normalized SER of 10%, it reduces energy consumption by an average of 4.8% compared to enabling simple parity+ECC protection at all times.
Funder
National Science Foundation
Semiconductor Research Corporation
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
3 articles.
1. Methods for Improving the Reliability of Intelligent Semiconductor;2021 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia);2021-11-01
2. Gem5Panalyzer: A Light-weight tool for Early-stage Architectural Reliability Evaluation & Prediction;2020 IEEE 63rd International Midwest Symposium on Circuits and Systems (MWSCAS);2020-08
3. Minotaur;Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems;2019-04-04