Affiliation:
1. College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
2. Beijing Aerospace Automatic Control Institute, Beijing 100854, China
Abstract
Due to the advantages of parallel architecture and low power consumption, a field-programmable gate array (FPGA) is typically utilized as the hardware for convolutional neural network (CNN) accelerators. However, SRAM-based FPGA devices are extremely susceptible to single-event upsets (SEUs) induced by space radiation. In this paper, a fault tolerance analysis and fault injection experiments are applied to a CNN accelerator, and the overall results show that SEUs occurring in a control unit (CTRL) lead to the highest system error rate, which is over 70%. After that, a hybrid hardening strategy consisting of a finite state machine error-correcting circuit (FSM-ECC) and a triple modular redundancy automatic hardening technique (TMR-AHT) is proposed in this paper to achieve a tradeoff between radiation reliability and design overhead. Moreover, the proposed methodology has very small workload and good migration ability. Finally, by full exploiting the fault tolerance property of CNNs, a highly reliable CNN accelerator with the proposed hybrid hardening strategy is implemented with Xilinx Zynq-7035. When BER is 2 × 10−6, the proposed hybrid hardening strategy reduces the whole system error rate by 78.95% with the overhead of an extra 20.7% of look-up tables (LUTs) and 20.9% of flip-flops (FFs).
Funder
National Defense Science and Technology Key Laboratory