A Fully Pipelined FPGA Architecture of a Factored Restricted Boltzmann Machine Artificial Neural Network

Author:

Kim Lok-Won (1), Asaad Sameh (2), Linsker Ralph (2)

Affiliation:

1. Cisco Systems

2. IBM T. J. Watson Research Center

Abstract

Artificial neural networks (ANNs) are a natural target for hardware acceleration by FPGAs and GPGPUs because commercial-scale applications can require days to weeks to train using CPUs, and the algorithms are highly parallelizable. Previous work on FPGAs has shown how hardware parallelism can be used to accelerate a “Restricted Boltzmann Machine” (RBM) ANN algorithm, and how to distribute computation across multiple FPGAs. Here we describe a fully pipelined parallel architecture that exploits “mini-batch” training (combining many input cases to compute each set of weight updates) to further accelerate ANN training. We implement on an FPGA, for the first time to our knowledge, a more powerful variant of the basic RBM, the “Factored RBM” (fRBM). The fRBM has proved valuable in learning transformations and in discovering features that are present across multiple types of input. We obtain (in simulation) a 100-fold acceleration (vs. CPU software) for an fRBM having N = 256 units in each of its four groups (two input, one output, one intermediate group of units) running on a Virtex-6 LX760 FPGA. Many of the architectural features we implement are applicable not only to fRBMs, but to basic RBMs and other ANN algorithms more broadly.
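To make the "mini-batch" idea in the abstract concrete, the following is a minimal sketch, not the authors' architecture. It assumes a basic (non-factored) RBM without bias terms, uses one step of contrastive divergence, and substitutes mean-field probabilities for stochastic sampling so the result is deterministic. The function names (`hidden_probs`, `visible_probs`, `minibatch_update`) and the learning rate are hypothetical. The point it illustrates is that the per-case contributions are accumulated over the whole batch and applied as a single weight update; the per-case work inside the loop is independent, which is the kind of work a pipeline can overlap.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W):
    # W[i][j] connects visible unit i to hidden unit j (biases omitted: assumption).
    n_hidden = len(W[0])
    return [sigmoid(sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(n_hidden)]


def visible_probs(h, W):
    return [sigmoid(sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(W))]


def minibatch_update(batch, W, lr=0.1):
    """Return new weights after one mini-batch; W itself is not modified."""
    n_vis, n_hid = len(W), len(W[0])
    grad = [[0.0] * n_hid for _ in range(n_vis)]
    for v0 in batch:
        # Each input case is processed independently of the others.
        h0 = hidden_probs(v0, W)   # positive phase
        v1 = visible_probs(h0, W)  # reconstruction
        h1 = hidden_probs(v1, W)   # negative phase
        for i in range(n_vis):
            for j in range(n_hid):
                grad[i][j] += v0[i] * h0[j] - v1[i] * h1[j]
    # One weight update per mini-batch, averaged over its input cases.
    scale = lr / len(batch)
    return [[W[i][j] + scale * grad[i][j] for j in range(n_hid)]
            for i in range(n_vis)]
```

For example, `minibatch_update([[1, 0], [0, 1]], [[0.0, 0.0], [0.0, 0.0]])` processes two input cases and produces one updated 2 x 2 weight matrix.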

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 26 articles.

1. Review of Intelligent Detection Technologies Based on 5G Networks;Proceedings of the 2023 4th International Conference on Computer Science and Management Technology;2023-10-13

2. VLSI Implementation of Neural Systems;Advances in Systems Analysis, Software Engineering, and High Performance Computing;2023-06-16

3. FPGA-based implementation of deep neural network using stochastic computing;Applied Soft Computing;2023-04

4. Ising machines as hardware solvers of combinatorial optimization problems;Nature Reviews Physics;2022-05-04

5. Logically synthesized and hardware-accelerated restricted Boltzmann machines for combinatorial optimization and integer factorization;Nature Electronics;2022-02-28
