A GPU implementation of inclusion-based points-to analysis

Published:2012-09-11 Issue:8 Volume:47 Page:107-116
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Mendez-Lojo Mario¹, Burtscher Martin², Pingali Keshav¹

Affiliation:

1. University of Texas, Austin, TX, USA

2. Texas State University, San Marcos, TX, USA

Abstract

Graphics Processing Units (GPUs) have emerged as powerful accelerators for many regular algorithms that operate on dense arrays and matrices. In contrast, we know relatively little about using GPUs to accelerate highly irregular algorithms that operate on pointer-based data structures such as graphs. For the most part, research has focused on GPU implementations of graph analysis algorithms that do not modify the structure of the graph, such as algorithms for breadth-first search and strongly-connected components. In this paper, we describe a high-performance GPU implementation of an important graph algorithm used in compilers such as gcc and LLVM: Andersen-style inclusion-based points-to analysis. This algorithm is challenging to parallelize effectively on GPUs because it makes extensive modifications to the structure of the underlying graph and performs relatively little computation. In spite of this, our program, when executed on a 14 Streaming Multiprocessor GPU, achieves an average speedup of 7x compared to a sequential CPU implementation and outperforms a parallel implementation of the same algorithm running on 16 CPU cores. Our implementation provides general insights into how to produce high-performance GPU implementations of graph algorithms, and it highlights key differences between optimizing parallel programs for multicore CPUs and for GPUs.
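The abstract notes that inclusion-based points-to analysis changes the structure of its graph while it runs. A minimal sequential sketch in Python can illustrate why; this is not the paper's GPU implementation. The function name `andersen`, the four constraint lists, and the worklist strategy are assumptions chosen for illustration, following the usual textbook formulation of Andersen-style analysis.

```python
from collections import defaultdict, deque

def andersen(address_of, copies, loads, stores):
    """Sequential worklist solver for inclusion-based points-to constraints.

    Hypothetical input encoding (not taken from the paper):
      address_of: (p, x) pairs for  p = &x
      copies:     (p, q) pairs for  p = q
      loads:      (p, q) pairs for  p = *q
      stores:     (p, q) pairs for  *p = q
    """
    pts = defaultdict(set)           # pts[v]: variables that v may point to
    succ = defaultdict(set)          # edge q -> p means pts[q] is a subset of pts[p]
    loads_from = defaultdict(list)   # q -> [p, ...] for each p = *q
    stores_to = defaultdict(list)    # p -> [q, ...] for each *p = q

    for p, x in address_of:
        pts[p].add(x)
    for p, q in copies:
        succ[q].add(p)
    for p, q in loads:
        loads_from[q].append(p)
    for p, q in stores:
        stores_to[p].append(q)

    worklist = deque(pts.keys())
    while worklist:
        n = worklist.popleft()
        # Load and store constraints add new edges, so the graph grows
        # during the solve rather than staying fixed.
        for v in list(pts[n]):
            for p in loads_from[n]:      # p = *n  adds edge v -> p
                if p not in succ[v]:
                    succ[v].add(p)
                    worklist.append(v)
            for q in stores_to[n]:       # *n = q  adds edge q -> v
                if v not in succ[q]:
                    succ[q].add(v)
                    worklist.append(q)
        # Propagate the points-to set of n along its outgoing edges.
        for m in list(succ[n]):
            before = len(pts[m])
            pts[m] |= pts[n]
            if len(pts[m]) > before:
                worklist.append(m)
    return {v: s for v, s in pts.items() if s}

# Example program: p = &x; q = &y; r = p; *r = q; s = *p
result = andersen(
    address_of=[("p", "x"), ("q", "y")],
    copies=[("r", "p")],
    loads=[("s", "p")],
    stores=[("r", "q")],
)
```

In this example the store `*r = q` adds an edge from `q` to `x` only after `r` is found to point to `x`. Each step does little arithmetic but mutates the graph, which matches the abstract's description of what makes the algorithm hard to parallelize.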

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/2370036.2145831

Reference: 34 articles.

1. NVIDIA's Next Generation CUDA Compute Architecture: Fermi. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf 2010.

2. CUDA C Programming Guide 4.0. NVIDIA 2011.

3. Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2

4. Computing Strongly Connected Components in Parallel on CUDA

5. Points-to analysis using BDDs

Cited by 53 articles.

1. Octopus: Scaling Value-Flow Analysis via Parallel Collection of Realizable Path Conditions;ACM Transactions on Software Engineering and Methodology;2024-03-29

2. Instruction Scheduling for the GPU on the GPU;2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO);2024-03-02

3. MIMD Programs Execution Support on SIMD Machines: A Holistic Survey;IEEE Access;2024

4. P-DATA: A Task-Level Parallel Framework for Dependency-Aware Value Flow Taint Analysis;2023 30th Asia-Pacific Software Engineering Conference (APSEC);2023-12-04

5. A Parallel Memory Defect Detection Method based on Sparse-Value-Flow Graph;2023 IEEE International Conference on Joint Cloud Computing (JCC);2023-07