Exploiting Idle Hardware to Provide Low Overhead Fault Tolerance for VLIW Processors

Author:

Sartor, Anderson L. [1]; Lorenzon, Arthur F. [1]; Carro, Luigi [1]; Kastensmidt, Fernanda [1]; Wong, Stephan [2]; Beck, Antonio C. S. [1]

Affiliation:

1. Federal University of Rio Grande do Sul, Porto Alegre, Brazil

2. Delft University of Technology, Delft, The Netherlands

Abstract

Because of technology scaling, the soft error rate in digital circuits has been increasing, which affects system reliability. Therefore, modern processors, including VLIW architectures, need mechanisms to mitigate such effects and guarantee reliable computing. In this scenario, our work proposes three low-overhead fault tolerance approaches based on instruction duplication with zero-latency detection, which use a rollback mechanism to correct soft errors in the pipelanes of a configurable VLIW processor. The first uses idle issue slots within a period of time to execute extra instructions, considering distinct application phases. The second works at a finer grain, adaptively exploiting idle functional units at run-time. However, some applications present high instruction-level parallelism (ILP), which reduces the ability to provide fault tolerance: fewer functional units are idle, so fewer instructions can be duplicated. The third approach addresses this issue by dynamically reducing ILP according to a configurable threshold, increasing fault tolerance at the cost of performance. While the first two approaches achieve significant fault coverage with minimal area and power overhead for applications with low ILP, the third improves fault tolerance with low performance degradation. All approaches are evaluated considering area, performance, power dissipation, and error coverage.
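The idea of duplicating instructions into idle issue slots, and of trading ILP for coverage through a threshold, can be illustrated with a small software model. This is only a sketch under stated assumptions and not the authors' hardware design: a bundle is modeled as a fixed-width list of operation names, `None` stands for an idle slot, every slot is assumed able to execute any operation, and the function names (`duplicate_into_idle_slots`, `limit_ilp`, `coverage`) are hypothetical. The comparison of results and the rollback are not modeled.

```python
from typing import List, Optional, Tuple

NOP = None  # models an idle issue slot (assumption of this sketch)
Bundle = List[Optional[str]]


def duplicate_into_idle_slots(bundle: Bundle) -> Tuple[Bundle, int]:
    """Fill idle slots of one bundle with copies of its own operations.

    Returns the filled bundle and the number of operations that got a copy.
    Simplification: operations are packed to the front of the bundle, and
    any slot is assumed to be able to run any operation.
    """
    ops = [op for op in bundle if op is not NOP]
    idle = len(bundle) - len(ops)
    n_dup = min(idle, len(ops))
    # A copy is marked with a trailing apostrophe. A checker would compare
    # the results of "op" and "op'" and request a rollback on a mismatch.
    copies = [op + "'" for op in ops[:n_dup]]
    return ops + copies + [NOP] * (idle - n_dup), n_dup


def limit_ilp(bundle: Bundle, threshold: int) -> List[Bundle]:
    """Split a bundle so that at most `threshold` operations issue together.

    Simplification: operations are split in order and register read/write
    timing between the resulting bundles is ignored.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    ops = [op for op in bundle if op is not NOP]
    width = len(bundle)
    out = []
    for i in range(0, len(ops), threshold):
        chunk = ops[i:i + threshold]
        out.append(chunk + [NOP] * (width - len(chunk)))
    return out or [list(bundle)]


def coverage(bundles: List[Bundle]) -> float:
    """Fraction of original operations that received a duplicate."""
    total = sum(1 for b in bundles for op in b if op is not NOP)
    dup = sum(duplicate_into_idle_slots(b)[1] for b in bundles)
    return dup / total if total else 0.0


if __name__ == "__main__":
    low_ilp = [["add", NOP, NOP, "mul"], ["add", "sub", "mul", NOP]]
    high_ilp = [["a", "b", "c", "d"]]
    print(coverage(low_ilp))                    # bundles with idle slots
    print(coverage(high_ilp))                   # no idle slots available
    print(coverage(limit_ilp(high_ilp[0], 2)))  # ILP limited to 2 ops per bundle
```

In this model a fully packed bundle gets no duplicates at all, while limiting it to two operations per bundle doubles the number of bundles (the performance cost) and lets every operation be duplicated.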

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Hardware and Architecture, Software

Cited by 9 articles.

1. Dynamic fault-tolerant VLIW processor with heterogeneous Function Units;Microprocessors and Microsystems;2022-09

2. DYRE: a DYnamic REconfigurable solution to increase GPGPU’s reliability;The Journal of Supercomputing;2021-03-29

3. SoMMA: A software-managed memory architecture for multi-issue processors;Microprocessors and Microsystems;2020-09

4. Run-Time Coarse-Grained Hardware Mitigation for Multiple Faults on VLIW Processors;2019 Conference on Design and Architectures for Signal and Image Processing (DASIP);2019-10

5. A Knapsack Methodology for Hardware-based DMR Protection against Soft Errors in Superscalar Out-of-Order Processors;2019 IFIP/IEEE 27th International Conference on Very Large Scale Integration (VLSI-SoC);2019-10
