SynergyFlow

Published: 2019-01-11 Issue: 1 Volume: 24 Page: 1-27
ISSN: 1084-4309
Container-title: ACM Transactions on Design Automation of Electronic Systems
Language: en
Short-container-title: ACM Trans. Des. Autom. Electron. Syst.

Author:

Li Jiajun¹, Yan Guihai¹, Lu Wenyan¹, Gong Shijun¹, Jiang Shuhao¹, Wu Jingya¹, Li Xiaowei¹

Affiliation:

1. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, People’s Republic of China

Abstract

Neural networks (NNs) have achieved great success in a broad range of applications. Because NN-based methods are often both computation and memory intensive, accelerators have proven highly promising in terms of both performance and energy efficiency. Although prior solutions can deliver high computational throughput for convolutional layers, they can suffer severe performance degradation when accommodating an entire network model, because computing and memory bandwidth requirements differ widely between convolutional layers and fully connected layers and, furthermore, among different NN models. To overcome this problem, we propose an elastic accelerator architecture, called SynergyFlow, which intrinsically supports layer-level and model-level parallelism for large-scale deep neural networks. SynergyFlow boosts resource utilization by exploiting the complementary resource demands of different layers and different NN models. SynergyFlow can dynamically reconfigure itself according to workload characteristics, maintaining high performance and high resource utilization across various models. As a case study, we implement SynergyFlow on a P395-AB FPGA board. At a working frequency of 100 MHz, our implementation improves performance by 33.8% on average (up to 67.2% on AlexNet) compared with comparably provisioned previous architectures.
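The complementary effect described in the abstract can be illustrated with a toy, roofline-style estimate. The sketch below is not the paper's model or scheduler. It assumes an idealized accelerator whose compute and off-chip bandwidth can be shared perfectly between co-running layers, and every name and number in it (`Layer`, `PEAK_OPS`, `BANDWIDTH`, the layer sizes) is hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    ops: float          # arithmetic operations the layer needs (hypothetical)
    bytes_moved: float  # off-chip traffic in bytes (hypothetical)


def layer_time(layer, peak_ops, bandwidth):
    """Time for one layer running alone: limited by compute or by memory."""
    return max(layer.ops / peak_ops, layer.bytes_moved / bandwidth)


def sequential_time(layers, peak_ops, bandwidth):
    """Layers run one after another; each leaves one resource mostly idle."""
    return sum(layer_time(layer, peak_ops, bandwidth) for layer in layers)


def overlapped_time(layers, peak_ops, bandwidth):
    """Idealized co-execution: both resources are shared with no overhead."""
    total_ops = sum(layer.ops for layer in layers)
    total_bytes = sum(layer.bytes_moved for layer in layers)
    return max(total_ops / peak_ops, total_bytes / bandwidth)


# Hypothetical accelerator: 100 Gop/s peak compute, 10 GB/s off-chip bandwidth.
PEAK_OPS = 100e9
BANDWIDTH = 10e9

# Hypothetical layers: a compute-bound convolutional layer and a
# memory-bound fully connected layer.
conv = Layer("conv", ops=200e9, bytes_moved=1e9)
fc = Layer("fc", ops=10e9, bytes_moved=20e9)

seq = sequential_time([conv, fc], PEAK_OPS, BANDWIDTH)
ovl = overlapped_time([conv, fc], PEAK_OPS, BANDWIDTH)
print(f"sequential: {seq:.2f} s, overlapped: {ovl:.2f} s")
```

With these made-up numbers each layer alone takes about 2 s, but the convolutional layer barely uses the memory bandwidth and the fully connected layer barely uses the compute units, so an ideal overlap finishes both in roughly 2.1 s instead of 4 s. A real design would fall short of this bound because of reconfiguration and contention costs, which this sketch ignores.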

Funder

National Natural Science Foundation of China

Youth Innovation Promotion Association of the Chinese Academy of Sciences

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications

Link

https://dl.acm.org/doi/pdf/10.1145/3275243

Reference: 54 articles.

1. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing

2. Representation Learning: A Review and New Perspectives

3. A programmable parallel accelerator for learning and classification

4. A dynamically configurable coprocessor for convolutional neural networks

Cited by 3 articles.

1. Performance-Driven LSTM Accelerator Hardware Using Split-Matrix-Based MVM;Circuits, Systems, and Signal Processing;2023-06-08

2. Design Space Optimization of Shared Memory Architecture in Accelerator-rich Systems;ACM Transactions on Design Automation of Electronic Systems;2021-04

3. Modular Neural Networks for Low-Power Image Classification on Embedded Devices;ACM Transactions on Design Automation of Electronic Systems;2021-01-05