ACE-GCN: A Fast Data-driven FPGA Accelerator for GCN Embedding

Published: 2021-12-31 Issue: 4 Volume: 14 Page: 1-23
ISSN: 1936-7406
Container-title: ACM Transactions on Reconfigurable Technology and Systems
Language: en
Short-container-title: ACM Trans. Reconfigurable Technol. Syst.

Author:

Hung José Romero¹, Li Chao¹, Wang Pengyu¹, Shao Chuanming¹, Guo Jinyang¹, Wang Jing¹, Shi Guoyong¹

Affiliation:

1. Shanghai Jiao Tong University, Shanghai, China

Abstract

ACE-GCN is a fast and resource/energy-efficient FPGA accelerator for graph convolutional embedding under data-driven and in-place processing conditions. Our accelerator exploits the inherent power-law distribution and high sparsity commonly exhibited by real-world graph datasets. Contrary to other hardware implementations of GCN, in which traditional optimization techniques are employed to bypass the problem of dataset sparsity, our architecture is designed to take advantage of this very same situation. We propose and implement an innovative acceleration approach supported by our “implicit-processing-by-association” concept, in conjunction with a dataset-customized convolutional operator. The computational relief and consequential acceleration effect arise from the possibility of replacing rather complex convolutional operations with a faster embedding result estimation. Based on a computationally inexpensive and super-expedited similarity calculation, our accelerator is able to decide between the automatic embedding estimation and the unavoidable direct convolution operation. Evaluations demonstrate that our approach presents excellent applicability and competitive acceleration value. Depending on the dataset and the targeted efficiency level, the acceleration ranges between 23× and 4,930× over the PyG baseline, coming close to AWB-GCN by 46% to 81% on smaller datasets and noticeably surpassing AWB-GCN on larger datasets, with controllable accuracy loss levels. We further demonstrate the unique hardware optimization characteristics of our approach and discuss its multi-processing potential.

Funder

National Key Research & Development Program of China

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3470536

References: 42 articles.

1. Data-Driven Rule Mining and Representation of Temporal Patterns in Physiological Sensor Data

2. Diana Cai, Trevor Campbell, and T. Broderick. 2016. Edge-exchangeable graphs and sparsity (NIPS 2016). arXiv: Machine Learning (2016).

3. Sensor-Based Activity Recognition

4. An FPGA-based hardware accelerator for CNNs using on-chip memories only: Design and benchmarking with Intel Movidius neural compute stick;Dinelli Gianmarco;International Journal of Reconfigurable Computing,2019

Cited by 6 articles.

1. A Survey of Computationally Efficient Graph Neural Networks for Reconfigurable Systems;Information;2024-06-28

2. Pedestrian Trajectory Prediction Based on Social Interactions Learning With Random Weights;IEEE Transactions on Multimedia;2024

3. DRAGON: Dynamic Recurrent Accelerator for Graph Online Convolution;ACM Transactions on Design Automation of Electronic Systems;2023-01-20

4. FPGA sharing in the cloud: a comprehensive analysis;Frontiers of Computer Science;2022-12-24

5. GPGCN: A General-Purpose Graph Convolution Neural Network Accelerator Based on RISC-V ISA Extension;Electronics;2022-11-21