Software/Hardware Co-design of 3D NoC-based GPU Architectures for Accelerated Graph Computations

Published: 2022-06-27 Issue: 6 Volume: 27 Pages: 1-22
ISSN: 1084-4309
Container title: ACM Transactions on Design Automation of Electronic Systems
Language: en
Short container title: ACM Trans. Des. Autom. Electron. Syst.

Author:

Choudhury Dwaipayan¹, Barik Reet¹, Rajam Aravind Sukumaran¹, Kalyanaraman Ananth¹, Pande Partha Pratim¹

Affiliation:

1. Washington State University, Pullman, WA

Abstract

Manycore GPU architectures have become the mainstay for accelerating graph computations. One of the primary bottlenecks to the performance of graph computations on manycore architectures is data movement. Since most of the accesses in graph processing are due to vertex neighborhood lookups, locality in graph data structures plays a key role in dictating the degree of data movement. Vertex reordering is a widely used technique to improve data locality within graph data structures. However, these reordering schemes alone are not sufficient, as they need to be complemented with efficient task allocation on manycore GPU architectures to reduce latency due to local cache misses. Consequently, in this article, we introduce a software/hardware co-design framework for accelerating graph computations. Our approach couples an architecture-aware vertex reordering with a priority-based task allocation technique. As the task allocation aims to reduce on-chip latency and associated energy, the choice of Network-on-Chip (NoC) as the communication backbone in the manycore platform is an important parameter. By leveraging emerging three-dimensional (3D) integration technology, we propose the design of a small-world NoC (SWNoC)-enabled manycore GPU architecture, where the placement of the links connecting the streaming multiprocessors (SMs) and the memory controllers (MCs) follows a power-law distribution. The proposed 3D SWNoC-enabled software/hardware co-design framework achieves 11.1% to 22.9% performance improvement and 16.4% to 32.6% less energy consumption, depending on the dataset and the graph application, when compared to the default ordering of the dataset running on a conventional planar mesh architecture.
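To make the power-law link placement mentioned in the abstract more concrete, the following is a minimal sketch and not the authors' design flow. It assumes that a link between two tiles at Manhattan distance d is chosen with weight d^(-alpha), which is one common way to obtain a small-world topology. The function name `power_law_links`, the exponent `alpha`, the grid dimensions, and the link budget are all hypothetical; the sketch ignores the distinction between SMs and MCs, vertical-link constraints, and any connectivity guarantee.

```python
import itertools
import random


def power_law_links(dims, num_links, alpha=2.0, seed=0):
    """Sample `num_links` distinct links between tiles of a 3D grid.

    A pair of tiles at Manhattan distance d is drawn with weight
    d ** -alpha (an assumed power law), so short links dominate
    while a few long-range links still appear.
    """
    # Enumerate every tile coordinate, e.g. (x, y, layer).
    tiles = list(itertools.product(*(range(n) for n in dims)))
    pairs = list(itertools.combinations(tiles, 2))
    if num_links > len(pairs):
        raise ValueError("more links requested than tile pairs available")

    def distance(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    weights = [distance(a, b) ** -alpha for a, b in pairs]
    rng = random.Random(seed)  # fixed seed for reproducibility
    links = set()
    while len(links) < num_links:
        # Weighted draw; duplicates are discarded by the set.
        links.add(rng.choices(pairs, weights=weights, k=1)[0])
    return sorted(links)


if __name__ == "__main__":
    # Hypothetical 4 x 4 x 2 grid (two stacked layers) with 48 links.
    for a, b in power_law_links((4, 4, 2), 48)[:5]:
        print(a, "<->", b)
```

A larger `alpha` concentrates the link budget on neighboring tiles and approaches a mesh-like layout, while `alpha = 0` reduces to uniform random placement; the value used in any real design would have to come from the paper itself.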

Funder

US National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications

Link

https://dl.acm.org/doi/pdf/10.1145/3514354

Cited by 4 articles.

1. Load Balanced PIM-Based Graph Processing;ACM Transactions on Design Automation of Electronic Systems;2024-06-21

2. A survey of machine learning for Network-on-Chips;Journal of Parallel and Distributed Computing;2024-04

3. Enabling Neuromorphic Computing for Artificial Intelligence with Hardware-Software Co-Design;Neuromorphic Computing;2023-11-15

4. Accelerating Graph Computations on 3D NoC-Enabled PIM Architectures;ACM Transactions on Design Automation of Electronic Systems;2023-03-19