A Systematic Literature Review on Hardware Reliability Assessment Methods for Deep Neural Networks

Authors:

Ahmadilivani Mohammad Hasan (1), Taheri Mahdi (1), Raik Jaan (1), Daneshtalab Masoud (2), Jenihhin Maksim (1)

Affiliations:

1. Tallinn University of Technology, Estonia

2. Mälardalen University, Sweden and Tallinn University of Technology, Estonia

Abstract

Artificial Intelligence (AI) and, in particular, Machine Learning (ML) are now used in a wide range of applications because of their ability to learn how to solve complex problems. Over the past decade, rapid advances in ML have produced Deep Neural Networks (DNNs) consisting of a large number of neurons and layers. DNN Hardware Accelerators (DHAs) are used to deploy DNNs in target applications, including safety-critical applications, where hardware faults or errors can have catastrophic consequences. The reliability of DNNs is therefore an essential subject of research. In recent years, several studies have assessed the reliability of DNNs, proposing various reliability assessment methods on a variety of platforms and applications. Hence, there is a need to summarize the state of the art and identify the gaps in the study of DNN reliability. In this work, we conduct a Systematic Literature Review (SLR) on the reliability assessment methods of DNNs to collect as many relevant research works as possible, present a categorization of them, and address the open challenges. Through this SLR, three kinds of methods for reliability assessment of DNNs are identified: Fault Injection (FI), Analytical, and Hybrid methods. Since the majority of works assess DNN reliability by FI, we comprehensively characterize the different approaches and platforms of the FI method. Analytical and Hybrid methods are also presented. For each kind of method, we describe the DNN platforms on which it has been applied and the reliability evaluation metrics it uses. Finally, we highlight the advantages and disadvantages of the identified methods and address the open challenges in the research area. We conclude that Analytical and Hybrid methods are lightweight yet sufficiently accurate, and that they have the potential to be extended in future research and used to establish novel DNN reliability assessment frameworks.
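As a rough illustration of what an FI-based assessment involves, the sketch below flips one random bit in one weight of a toy classifier and counts how often the predictions change. It is not taken from the reviewed works: the single-layer model, the float32 single-bit-flip fault model, and the names `flip_bit`, `predict`, and `fault_mismatch_rate` are all assumptions made for this example.

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE-754 float32 encoding flipped."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return faulty


def predict(weights, inputs):
    """Toy single-layer classifier: index of the largest weighted sum."""
    scores = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return scores.index(max(scores))


def fault_mismatch_rate(weights, samples, trials, seed=0):
    """Fraction of injected faults that change at least one prediction.

    Each trial flips one random bit in one random weight (an assumed
    fault model) and compares against the fault-free predictions.
    """
    rng = random.Random(seed)
    golden = [predict(weights, s) for s in samples]
    mismatches = 0
    for _ in range(trials):
        r = rng.randrange(len(weights))
        c = rng.randrange(len(weights[0]))
        bit = rng.randrange(32)
        faulty = [row[:] for row in weights]  # leave the original intact
        faulty[r][c] = flip_bit(faulty[r][c], bit)
        if [predict(faulty, s) for s in samples] != golden:
            mismatches += 1
    return mismatches / trials


if __name__ == "__main__":
    weights = [[0.5, -0.25], [-0.75, 1.0]]
    samples = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(fault_mismatch_rate(weights, samples, trials=1000))
```

Even in this toy setting, each injected fault requires re-running inference on every sample, which hints at why exhaustive FI on realistic networks can become expensive.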

Funder

Information and Communication Technologies (ICT) programme

Estonian Research Council

Estonian-French science and technology cooperation programme PARROT

Swedish Innovation Agency VINNOVA project SafeDeep

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science
