DDAM: Data Distribution-Aware Mapping of CNNs on Processing-In-Memory Systems
Published: 2023-03-19
Issue: 3
Volume: 28
Pages: 1-30
ISSN: 1084-4309
Container title: ACM Transactions on Design Automation of Electronic Systems
Language: en
Short container title: ACM Trans. Des. Autom. Electron. Syst.
Authors:
Wang Junpeng (1),
Du Haitao (1),
Ding Bo (1),
Xu Qi (1),
Chen Song (2),
Kang Yi (2)
Affiliations:
1. University of Science and Technology of China, Hefei, Anhui, P.R. China
2. University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, Anhui, P.R. China
Abstract
Convolutional neural networks (CNNs) are widely used in image processing, natural language processing, and many other fields. The large volume of memory accesses in CNNs is a major concern in CNN accelerator design, affecting both performance and energy efficiency. With fast, low-cost memory access, Processing-In-Memory (PIM) systems are a feasible way to alleviate the memory concerns of CNNs. However, the distributed data storage of PIM systems conflicts with the extensive data reuse of CNN layers: nodes of a PIM system may need to share their data with each other before processing a CNN layer, incurring extra communication overhead. In this article, we propose DDAM to map CNNs onto PIM systems with reduced communication overhead. First, a data transfer strategy is proposed to handle the data-sharing requirement among PIM nodes by formulating it as a Traveling Salesman Problem (TSP). Second, to improve data locality, a dynamic programming algorithm is proposed to partition the CNN and allocate a number of nodes to each part. Finally, an integer linear programming (ILP)-based mapping algorithm is proposed to map the partitioned CNN onto the PIM system. Experimental results show that, compared to the baselines, DDAM achieves 2.0× higher throughput with 37% lower energy cost on average.
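To give a flavor of the TSP formulation mentioned in the abstract, the sketch below is an illustrative toy, not the paper's actual algorithm: it orders inter-node data transfers on a hypothetical 2×2 PIM mesh (hop-distance matrix assumed) with a simple nearest-neighbour TSP heuristic, so each transfer hands data to a nearby node.

```python
# Illustrative sketch (assumptions: a 2x2 PIM mesh with Manhattan hop
# distances; this is NOT the algorithm from the paper, just a generic
# nearest-neighbour heuristic for a TSP-style transfer ordering).

def nearest_neighbour_tour(dist, start=0):
    """Greedy TSP heuristic: repeatedly visit the closest unvisited node."""
    n = len(dist)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[last][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_cost(dist, tour):
    """Total hop count of an open tour visiting nodes in the given order."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

# Manhattan hop distances between the 4 nodes of a 2x2 mesh,
# indexed (0,0)=0, (0,1)=1, (1,0)=2, (1,1)=3.
dist = [
    [0, 1, 1, 2],
    [1, 0, 2, 1],
    [1, 2, 0, 1],
    [2, 1, 1, 0],
]

tour = nearest_neighbour_tour(dist)
print(tour, tour_cost(dist, tour))  # every hop in this tour has cost 1
```

A nearest-neighbour heuristic gives no optimality guarantee in general; the point here is only the modeling step of treating "which node forwards its shared data next" as a tour over PIM nodes weighted by interconnect distance.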
Funder
National Key R&D Program of China
National Natural Science Foundation of China
CAS Project for Young Scientists in Basic Research
Strategic Priority Research Program of Chinese Academy of Sciences
Publisher
Association for Computing Machinery (ACM)
Subject
Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications
References: 45 articles.
Cited by: 4 articles.
1. Load Balanced PIM-Based Graph Processing. ACM Transactions on Design Automation of Electronic Systems, 2024-06-21.
2. ILP-based Multi-Branch CNNs Mapping on Processing-in-Memory Architecture. 2024 IEEE 6th International Conference on AI Circuits and Systems (AICAS), 2024-04-22.
3. PIM-trie: A Skew-resistant Trie for Processing-in-Memory. Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 2023-06-17.
4. NicePIM: Design Space Exploration for Processing-In-Memory DNN Accelerators With 3D-Stacked-DRAM. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023.