Affiliation:
1. Shanghai Jiao Tong University, Shanghai, China
Abstract
ACE-GCN is a fast and resource- and energy-efficient FPGA accelerator for graph convolutional embedding under data-driven and in-place processing conditions. Our accelerator exploits the inherent power-law distribution and high sparsity commonly exhibited by real-world graph datasets. In contrast to other hardware implementations of GCN, in which traditional optimization techniques are employed to bypass the problem of dataset sparsity, our architecture is designed to take advantage of that sparsity. We propose and implement an acceleration approach built on our “implicit-processing-by-association” concept, in conjunction with a dataset-customized convolutional operator. The computational relief and resulting acceleration arise from replacing relatively complex convolutional operations with a faster estimation of the embedding result. Based on a computationally inexpensive and very fast similarity calculation, our accelerator decides between automatic embedding estimation and the unavoidable direct convolution operation. Evaluations demonstrate that our approach offers excellent applicability and competitive acceleration. Depending on the dataset and the targeted efficiency level, ACE-GCN achieves speedups between 23× and 4,930× over the PyG baseline, coming close to AWB-GCN by 46% to 81% on smaller datasets and noticeably surpassing AWB-GCN on larger datasets, with controllable levels of accuracy loss. We further demonstrate the unique hardware optimization characteristics of our approach and discuss its multi-processing potential.
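The decision between estimation and direct convolution described in the abstract can be illustrated with a minimal software sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the similarity measure (Jaccard overlap of closed neighborhoods), the stand-in convolution (a plain mean over a node and its neighbors), the `threshold` value, and the function names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Cheap set similarity in [0, 1] (assumed metric, not the paper's)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_aggregate(node: int, neighbors: Dict[int, Set[int]],
                   features: Dict[int, List[float]]) -> List[float]:
    """Stand-in for the direct convolution: mean over the node and its neighbors."""
    group = [node] + sorted(neighbors[node])
    dim = len(features[node])
    return [sum(features[n][d] for n in group) / len(group) for d in range(dim)]


def embed_by_association(neighbors: Dict[int, Set[int]],
                         features: Dict[int, List[float]],
                         threshold: float = 0.8
                         ) -> Tuple[Dict[int, List[float]], Set[int]]:
    """Reuse the embedding of a sufficiently similar, already computed node;
    otherwise fall back to the direct aggregation."""
    embeddings: Dict[int, List[float]] = {}
    estimated: Set[int] = set()
    computed: List[int] = []  # nodes whose embeddings were computed directly
    for node in sorted(neighbors):
        closed = neighbors[node] | {node}
        match = next((ref for ref in computed
                      if jaccard(closed, neighbors[ref] | {ref}) >= threshold),
                     None)
        if match is not None:
            embeddings[node] = list(embeddings[match])  # estimate by association
            estimated.add(node)
        else:
            embeddings[node] = mean_aggregate(node, neighbors, features)
            computed.append(node)
    return embeddings, estimated


# Toy graph: a triangle (0, 1, 2) and a separate edge (3, 4).
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0],
            3: [2.0, 2.0], 4: [4.0, 0.0]}
embeddings, estimated = embed_by_association(neighbors, features)
```

In this toy graph only nodes 0 and 3 go through the direct aggregation; the others reuse a stored result. Lowering `threshold` trades accuracy for fewer direct computations, which loosely mirrors the controllable accuracy loss mentioned in the abstract.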
Funder
National Key Research & Development Program of China
Publisher
Association for Computing Machinery (ACM)
Cited by
6 articles.