multiPULPly

Author:

Eliahu Adi (1), Ronen Ronny (1), Gaillardon Pierre-Emmanuel (2), Kvatinsky Shahar (1)

Affiliation:

1. Technion-Israel Institute of Technology, Haifa, Israel

2. University of Utah, Salt Lake City, Utah

Abstract

Computationally intensive neural network applications often need to run on resource-limited low-power devices. Numerous hardware accelerators have been developed to speed up the performance of neural network applications and reduce power consumption; however, most focus on data centers and full-fledged systems. Acceleration in ultra-low-power systems has been only partially addressed. In this article, we present multiPULPly, an accelerator that integrates memristive technologies within standard low-power CMOS technology, to accelerate multiplication in neural network inference on ultra-low-power systems. This accelerator was designated for PULP, an open-source microcontroller system that uses low-power RISC-V processors. Memristors were integrated into the accelerator to enable power consumption only when the memory is active, to continue the task with no context-restoring overhead, and to enable highly parallel analog multiplication. To reduce the energy consumption, we propose novel dataflows that handle common multiplication scenarios and are tailored for our architecture. The accelerator was tested on FPGA and achieved a peak energy efficiency of 19.5 TOPS/W, outperforming state-of-the-art accelerators by 1.5× to 4.5×.

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Hardware and Architecture, Software

Cited by 4 articles.

1. Accelerators in Embedded Systems for Machine Learning: A RISCV View;2023 38th Conference on Design of Circuits and Integrated Systems (DCIS);2023-11-15

2. A CNN Hardware Accelerator Using Triangle-based Convolution;ACM Journal on Emerging Technologies in Computing Systems;2022-10-13

3. A review of CNN accelerators for embedded systems based on RISC-V;2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS);2022-08-01

4. A Parallel SystemC Virtual Platform for Neuromorphic Architectures;2022 23rd International Symposium on Quality Electronic Design (ISQED);2022-04-06
