GPU Concurrency

Published:2015-05-29 Issue:1 Volume:43 Page:577-591
ISSN:0163-5964
Container-title:ACM SIGARCH Computer Architecture News
language:en
Short-container-title:SIGARCH Comput. Archit. News

Author:

Alglave Jade¹, Batty Mark², Donaldson Alastair F.³, Gopalakrishnan Ganesh⁴, Ketema Jeroen³, Poetzl Daniel⁵, Sorensen Tyler⁶, Wickerson John³

Affiliation:

1. University College London; Microsoft Research, London; Cambridge, United Kingdom

2. University of Cambridge, Cambridge, United Kingdom

3. Imperial College London, London, United Kingdom

4. University of Utah, Salt Lake City, USA

5. University of Oxford, Oxford, United Kingdom

6. University College London, London, United Kingdom

Abstract

Concurrency is pervasive and perplexing, particularly on graphics processing units (GPUs). Current specifications of languages and hardware are inconclusive; thus programmers often rely on folklore assumptions when writing software. To remedy this state of affairs, we conducted a large empirical study of the concurrent behaviour of deployed GPUs. Armed with litmus tests (i.e. short concurrent programs), we questioned the assumptions in programming guides and vendor documentation about the guarantees provided by hardware. We developed a tool to generate thousands of litmus tests and run them under stressful workloads. We observed a litany of previously elusive weak behaviours, and exposed folklore beliefs about GPU programming (often supported by official tutorials) as false. As a way forward, we propose a model of Nvidia GPU hardware, which correctly models every behaviour witnessed in our experiments. The model is a variant of SPARC Relaxed Memory Order (RMO), structured following the GPU concurrency hierarchy.
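
The abstract describes litmus tests only as "short concurrent programs". As an illustrative sketch (not the authors' tool, and not a model of any GPU), the Python below encodes the classic message-passing test and enumerates every outcome that a sequentially consistent machine could produce. All names here (`T0`, `T1`, `interleavings`, `run`, `sc_outcomes`, `WEAK`) are hypothetical and chosen for this example. The point is that the outcome `(r1, r2) = (1, 0)` never appears in the enumeration, so seeing it on real hardware would count as a weak behaviour of the kind the study looks for.

```python
from itertools import combinations

# Message-passing (MP) litmus test, written as two tiny threads.
# Thread 0: x = 1; y = 1        (write data, then set flag)
# Thread 1: r1 = y; r2 = x      (read flag, then read data)
T0 = [("store", "x", 1), ("store", "y", 1)]
T1 = [("load", "y", "r1"), ("load", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        slots = set(slots)  # positions in the merged schedule taken by thread a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute one interleaving against a single shared memory (sequential consistency)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, loc, arg in schedule:
        if op == "store":
            mem[loc] = arg
        else:
            regs[arg] = mem[loc]
    return regs["r1"], regs["r2"]


def sc_outcomes():
    """Collect the (r1, r2) results of all sequentially consistent executions."""
    return {run(s) for s in interleavings(T0, T1)}


WEAK = (1, 0)  # flag observed as set while the data read is still stale

if __name__ == "__main__":
    outcomes = sc_outcomes()
    print(sorted(outcomes))
    print("weak outcome allowed under SC:", WEAK in outcomes)
```

A real litmus run would execute the two threads on the device many times, under stress, and count how often each outcome occurs. This sketch only computes the reference set against which such observations would be compared.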

Funder

NSF CCF

SRC

EPSRC

EU FP7 project CARP

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/2786763.2694391

Reference: 44 articles.

1. Online companion material. http://virginia.cs.ucl.ac.uk/sunflowers/asplos15/.

2. GPUBench June 2014. http://graphics.stanford.edu/projects/gpubench.

3. A formal hierarchy of weak memory models

4. Fences in weak memory models (extended version)

Cited by 7 articles.

1. CAAT: consistency as a theory;Proceedings of the ACM on Programming Languages;2022-10-31

2. GaccO - A GPU-accelerated OLTP DBMS;Proceedings of the 2022 International Conference on Management of Data;2022-06-10

3. The semantics of shared memory in Intel CPU/FPGA systems;Proceedings of the ACM on Programming Languages;2021-10-20

4. Systems-on-Chip with Strong Ordering;ACM Transactions on Architecture and Code Optimization;2021-01-21

5. On-GPU thread-data remapping for nested branch divergence;Journal of Parallel and Distributed Computing;2020-05