Tolerating Defects in Low-Power Neural Network Accelerators Via Retraining-Free Weight Approximation

Published:2021-10-31 Issue:5s Volume:20 Page:1-21
ISSN:1539-9087
Container-title:ACM Transactions on Embedded Computing Systems
Language:en
Short-container-title:ACM Trans. Embed. Comput. Syst.

Author:

Hosseini Fateme S.¹, Meng Fanruo¹, Yang Chengmo¹, Wen Wujie², Cammarota Rosario³

Affiliation:

1. University of Delaware, Newark, USA

2. Lehigh University, Bethlehem, USA

3. Intel, San Jose, USA

Abstract

Hardware accelerators are essential to accommodating ever-increasing Deep Neural Network (DNN) workloads on resource-constrained embedded devices. While accelerators facilitate fast and energy-efficient DNN operations, their accuracy is threatened by faults in their on-chip and off-chip memories, where millions of DNN weights are held. The use of emerging Non-Volatile Memories (NVM) further exposes DNN accelerators to a non-negligible rate of permanent defects due to immature fabrication, limited endurance, and aging. To tolerate defects in NVM-based DNN accelerators, previous work either requires extra redundancy in hardware or performs defect-aware retraining, both of which impose significant overhead. In comparison, this paper proposes a set of algorithms that exploit the flexibility in setting the fault-free bits in weight memory to effectively approximate weight values, so as to mitigate the defect-induced accuracy drop. These algorithms can be applied as a one-step solution when loading the weights to embedded devices. They require only trivial hardware support and impose negligible run-time overhead. Experiments on popular DNN models show that the proposed techniques successfully boost inference accuracy even in the face of elevated defect rates in the weight memory.
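
The core idea in the abstract, choosing the fault-free bits so that the stored value lands as close as possible to the intended weight, can be illustrated with a small sketch. This is not the paper's algorithm. It is a brute-force illustration under assumptions that do not come from the abstract: weights are unsigned fixed-width integers, defects are modeled as cells stuck at 0 or 1, and the function names `naive_store` and `closest_storable` are hypothetical.

```python
from itertools import product


def naive_store(target: int, stuck_mask: int, stuck_bits: int) -> int:
    """Value actually read back if `target` is written without any correction.

    stuck_mask: bit i set means cell i is defective (assumed stuck-at model).
    stuck_bits: the value each defective cell is stuck at.
    """
    return (target & ~stuck_mask) | (stuck_bits & stuck_mask)


def closest_storable(target: int, stuck_mask: int, stuck_bits: int,
                     width: int = 8) -> int:
    """Search every setting of the fault-free bits and return the storable
    value nearest to `target` (exhaustive search, fine for small widths)."""
    free_positions = [i for i in range(width) if not (stuck_mask >> i) & 1]
    base = stuck_bits & stuck_mask  # defective cells contribute fixed bits
    best = None
    for bits in product((0, 1), repeat=len(free_positions)):
        candidate = base
        for pos, bit in zip(free_positions, bits):
            candidate |= bit << pos
        if best is None or abs(candidate - target) < abs(best - target):
            best = candidate
    return best


# Hypothetical example: the most significant bit of an 8-bit cell is stuck at 0.
print(naive_store(128, 0x80, 0x00))       # 0   (error of 128)
print(closest_storable(128, 0x80, 0x00))  # 127 (error of 1)
```

In this example, writing the weight unchanged loses the whole value, while setting the seven fault-free bits to 1 brings the stored value within 1 of the target. The exhaustive search is used only for clarity; the abstract indicates the actual techniques run with negligible overhead at weight-loading time.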

Funder

Semiconductor Research Corporation

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3477016

Cited by 5 articles.

1. High Energy-Efficient Approximate In-SRAM Computing with Bit-Wise Compressor Configuration and Data-Aware Weight Remapping Method for Neural Network Acceleration; 2023 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA); 2023-10-27

2. Fuse Devices for Pruning in Memristive Neural Network; IEEE Electron Device Letters; 2023-03

3. High-Performance Reconfigurable DNN Accelerator on a Bandwidth-limited Embedded System; ACM Transactions on Embedded Computing Systems; 2022-05-02

4. Fault-Tolerant Deep Neural Networks for Processing-In-Memory based Autonomous Edge Systems; 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE); 2022-03-14

5. Write Variation & Reliability Error Compensation by Layer-wise Tunable Retraining of Edge FeFET LM-GA CiM; IEICE Transactions on Electronics; 2022