PARTANS

Published:2013-01 Issue:4 Volume:9 Page:1-24
ISSN:1544-3566
Container-title:ACM Transactions on Architecture and Code Optimization
language:en
Short-container-title:ACM Trans. Archit. Code Optim.

Author:

Lutz Thibaut¹, Fensch Christian¹, Cole Murray¹

Affiliation:

1. University of Edinburgh, Edinburgh, United Kingdom

Abstract

GPGPUs are a powerful and energy-efficient solution for many problems. For higher performance or larger problems, it is necessary to distribute the problem across multiple GPUs, increasing the already high programming complexity. In this article, we focus on abstracting the complexity of multi-GPU programming for stencil computation. We show that the best strategy depends not only on the stencil operator, problem size, and GPU, but also on the PCI express layout. This adds nonuniform characteristics to a seemingly homogeneous setup, causing up to 23% performance loss. We address this issue with an autotuner that optimizes the distribution across multiple GPUs.

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/2400682.2400718

References: 22 articles.

1. AMD. Accelerated parallel processing (APP) SDK (formerly ATI stream). http://developer.amd.com/appsdk

2. Applied Numerical Algorithms Group LBNL. CHOMBO - Software for adaptive solutions of partial differential equations. https://commons.lbl.gov/display/chombo/

3. PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures

4. Auto-tuning SkePU

Cited by 50 articles.

1. Optimizing Three-Dimensional Stencil-Operations on Heterogeneous Computing Environments;International Journal of Parallel Programming;2024-06-21

2. Stencil Computation with Vector Outer Product;Proceedings of the 38th ACM International Conference on Supercomputing;2024-05-30

3. Fingerprinting and Mapping Cloud FPGA Infrastructures;Security of FPGA-Accelerated Cloud Computing Environments;2023-09-18

4. EPSILOD: efficient parallel skeleton for generic iterative stencil computations in distributed GPUs;The Journal of Supercomputing;2023-01-14

5. AlphaSparse: Generating High Performance SpMV Codes Directly from Sparse Matrices;SC22: International Conference for High Performance Computing, Networking, Storage and Analysis;2022-11