Software-Based Selective Validation Techniques for Robust CGRAs Against Soft Errors

Author:

Ko Yohan (1), Kang Jihoon (2), Lee Jongwon (3), Kim Yongjoo (4), Kim Joonhyun (1), So Hwisoo (1), Lee Kyoungwoo (1), Paek Yunheung (3)

Affiliation:

1. Yonsei University

2. LIG Nex1 Co., Ltd.

3. Seoul National University

4. Electronics and Telecommunications Research Institute

Abstract

Coarse-Grained Reconfigurable Architectures (CGRAs) are drawing significant attention because they promise both performance through parallelism and flexibility through reconfiguration. Soft errors (or transient faults) are becoming a serious design concern in embedded systems, including CGRAs, because the soft error rate increases exponentially as technology scales. A recently proposed software-based technique that implements TMR (Triple Modular Redundancy) on CGRAs incurs extreme runtime and energy overheads, mainly because of the expensive voting mechanism applied to the outputs of every triplicated operation. In this article, we propose selective validation mechanisms for efficient modular redundancy techniques in the datapaths of CGRAs. Our techniques validate results only at synchronous operations rather than at every operation, which reduces the performance overhead of the validation mechanism. We also present an optimization technique that further improves runtime and energy consumption by minimizing the number of synchronous operations at which a validation mechanism must be applied. Our experimental results demonstrate that our selective validation-based TMR technique with this optimization improves runtime by 41.0% and energy consumption by 26.2% on average across benchmarks, compared with the recently proposed software-based TMR technique with full validation.
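The difference between full and selective validation can be illustrated with a small software model. The sketch below is not the authors' implementation: it assumes a toy program of three operations, treats the last one as the only synchronous operation (tagged `"sync"`), and uses a bitwise majority vote as the validation mechanism. The names `vote`, `run_tmr`, and `PROGRAM`, and the injected bit flip, are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Each operation is (kind, function). The "sync" tag is an assumption of this
# sketch, standing in for the synchronous operations the abstract refers to.
Op = Tuple[str, Callable[[int], int]]


def vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies."""
    return (a & b) | (b & c) | (a & c)


def run_tmr(ops: List[Op], x: int, selective: bool,
            fault_at: Optional[int] = None) -> Tuple[int, int]:
    """Run ops on three redundant copies and return (result, number_of_votes).

    With selective=False every operation is followed by a vote (full
    validation). With selective=True only "sync" operations are voted on.
    fault_at optionally flips one bit in one copy after that operation index.
    """
    copies = [x, x, x]
    votes = 0
    for i, (kind, fn) in enumerate(ops):
        copies = [fn(v) for v in copies]
        if fault_at == i:
            copies[0] ^= 0x4  # hypothetical single-bit soft error in copy 0
        if not selective or kind == "sync":
            voted = vote(*copies)
            copies = [voted, voted, voted]
            votes += 1
    return copies[0], votes


# Hypothetical three-operation program ending in one synchronous operation.
PROGRAM: List[Op] = [
    ("compute", lambda v: v + 3),
    ("compute", lambda v: v * 2),
    ("sync", lambda v: v & 0xFF),
]

full_result, full_votes = run_tmr(PROGRAM, 5, selective=False, fault_at=0)
sel_result, sel_votes = run_tmr(PROGRAM, 5, selective=True, fault_at=0)
print(full_result, full_votes)
print(sel_result, sel_votes)
```

In this toy model both schemes mask the injected fault, but the selective scheme votes once instead of three times. That is the kind of saving the abstract attributes to validating only at synchronous operations; the model says nothing about the actual runtime or energy figures reported.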

Funder

Development on the SW/HW Modules of Processor Monitor for System Intrusion Detection

IITP

Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government

Korean government

Brain Korea Plus Project in 2015

Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning

National Research Foundation of Korea (NRF) grant funded by the Korean government

the MSIP, Korea, under the ITRC (Information Technology Research Center) support program

Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning

The Core Technology Development of SW-SoC Convergence Platform for Hyper-Connection Services Among Smart Devices Based on Heterogeneous Multi-core Clusters

MSIP (Ministry of Science, ICT and Future Planning) under the Research Project on High Performance and Scalable Manycore Operating System

ICT at Seoul National University

Inter-University Semiconductor Research Center

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Cited by 4 articles.

1. M2STaR: A Multimode Spatio-Temporal Redundancy Design for Fault-Tolerant Coarse-Grained Reconfigurable Architectures;IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems;2023-09

2. Root cause analysis of soft-error-induced failures from hardware and software perspectives;Journal of Systems Architecture;2022-09

3. Survey of Software-Implemented Soft Error Protection;Electronics;2022-02-03

4. Protecting Caches from Soft Errors;ACM Transactions on Embedded Computing Systems;2017-11-30
