Gradient Decomposition Methods for Training Neural Networks With Non-ideal Synaptic Devices

Authors

Zhao Junyun, Huang Siyuan, Yousuf Osama, Gao Yutong, Hoskins Brian D., Adam Gina C.

Abstract

While promising for high-capacity machine learning accelerators, memristor devices have non-idealities that prevent software-equivalent accuracy when used for online training. This work combines Mini-Batch Gradient Descent (MBGD) to average gradients, stochastic rounding to avoid vanishing weight updates, and decomposition methods to keep the memory overhead low during mini-batch training. Because the weight update has to be transferred to the memristor matrices efficiently, we also investigate the impact of reconstructing the gradient matrices both internally (rank-seq) and externally (rank-sum) to the memristor array. Our results show that streaming batch principal component analysis (streaming batch PCA) and non-negative matrix factorization (NMF) decomposition algorithms can achieve near-MBGD accuracy in a memristor-based multi-layer perceptron trained on the MNIST (Modified National Institute of Standards and Technology) database with only 3 to 10 ranks, at significant memory savings. Moreover, NMF rank-seq outperforms streaming batch PCA rank-seq at low ranks, making it more suitable for hardware implementation in future memristor-based accelerators.
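The ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation, and it rests on several assumptions: a truncated SVD stands in for streaming batch PCA and NMF (neither is implemented here), the conductance step `step` and learning rate `lr` are hypothetical parameters, and all function names are invented for illustration. Rank-sum is read as summing the rank-1 terms outside the array and applying one rounded update, and rank-seq as applying each rank-1 term to the array in turn, following the abstract's "externally" and "internally" wording.

```python
import numpy as np


def stochastic_round(x, step, rng):
    """Round x to a multiple of `step`, rounding up with probability
    equal to the fractional remainder, so small updates survive on average."""
    scaled = x / step
    floor = np.floor(scaled)
    round_up = rng.random(x.shape) < (scaled - floor)
    return (floor + round_up) * step


def low_rank_factors(grad, rank):
    """Truncated SVD of the gradient (a stand-in for streaming batch PCA or NMF).
    Returns `left` (rows x rank) and `right` (rank x cols) with left @ right ~ grad."""
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]


def rank_sum_update(w, left, right, lr, step, rng):
    """Assumed rank-sum: rebuild the low-rank gradient outside the array,
    then apply a single stochastically rounded update."""
    delta = -lr * (left @ right)
    return w + stochastic_round(delta, step, rng)


def rank_seq_update(w, left, right, lr, step, rng):
    """Assumed rank-seq: apply each rank-1 outer product to the array in turn,
    rounding after every term."""
    for r in range(left.shape[1]):
        delta = -lr * np.outer(left[:, r], right[r])
        w = w + stochastic_round(delta, step, rng)
    return w


# Hypothetical layer: 16 outputs, 32 inputs, mini-batch of 64 samples.
rng = np.random.default_rng(0)
errors = rng.standard_normal((64, 16))       # back-propagated errors
activations = rng.standard_normal((64, 32))  # layer inputs
grad = errors.T @ activations / 64           # mini-batch averaged gradient

left, right = low_rank_factors(grad, rank=3)
weights = np.zeros((16, 32))
w_sum = rank_sum_update(weights, left, right, lr=0.1, step=0.01, rng=rng)
w_seq = rank_seq_update(weights, left, right, lr=0.1, step=0.01, rng=rng)
```

In this sketch the factors hold rank × (rows + columns) values, 3 × (16 + 32) = 144, against 16 × 32 = 512 for the full gradient matrix, which is the kind of memory saving the abstract refers to.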

Funder

Office of Naval Research

George Washington University

National Institute of Standards and Technology

Publisher

Frontiers Media SA

Subject

General Neuroscience

Cited by 2 articles.

1. Neural Network Modeling Bias for Hafnia-based FeFETs;Proceedings of the 18th ACM International Symposium on Nanoscale Architectures;2023-12-18

2. Device Modeling Bias in ReRAM-Based Neural Network Simulations;IEEE Journal on Emerging and Selected Topics in Circuits and Systems;2023-03