Affiliation:
1. Imperial College London, London, United Kingdom
Abstract
Prefix sums are key building blocks in the implementation of many concurrent software applications, and recently much work has gone into efficiently implementing prefix sums to run on massively parallel graphics processing units (GPUs). Because they lie at the heart of many GPU-accelerated applications, the correctness of prefix sum implementations is of prime importance.
We introduce a novel abstraction, the interval of summations, that allows scalable reasoning about implementations of prefix sums. We present this abstraction as a monoid, and prove a soundness and completeness result showing that a generic sequential prefix sum implementation is correct for an array of length $n$ if and only if it computes the correct result for a specific test case when instantiated with the interval of summations monoid. This allows correctness to be established by running a single test where the input and result require $O(n \lg(n))$ space. This improves upon an existing result by Sheeran where the input requires $O(n \lg(n))$ space and the result $O(n^2 \lg(n))$ space, and is more feasible for large $n$ than a method by Voigtlaender that uses $O(n)$ space for the input and result but requires running $O(n^2)$ tests. We then extend our abstraction and results to the context of data-parallel programs, developing an automated verification method for GPU implementations of prefix sums. Our method uses static verification to prove that a generic prefix sum implementation is data race-free, after which functional correctness of the implementation can be determined by running a single test case under the interval of summations abstraction.
We present an experimental evaluation using four different prefix sum algorithms, showing that our method is highly automatic, scales to large thread counts, and significantly outperforms Voigtlaender's method when applied to large arrays.
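The single-test idea can be illustrated with a minimal sketch. It assumes one plausible encoding of the interval of summations: an element `Interval(lo, hi)` stands for the summation of inputs `lo` through `hi`, two intervals combine only when they are adjacent, and every other combination collapses to an absorbing error value. The names `Interval`, `TOP`, `combine`, `hillis_steele`, `swapped_operands`, and `passes_interval_test` are hypothetical and chosen for illustration; the identity element is omitted because an inclusive prefix sum does not need it, and the paper's exact formulation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value standing for the summation x[lo] + ... + x[hi]."""
    lo: int
    hi: int


# Absorbing error element: the result of combining non-adjacent intervals.
TOP = "TOP"


def combine(a, b):
    """Assumed monoid operator: defined only for adjacent intervals."""
    if a == TOP or b == TOP:
        return TOP
    if a.hi + 1 == b.lo:
        return Interval(a.lo, b.hi)
    return TOP


def hillis_steele(xs, op):
    """Generic inclusive prefix sum (Hillis-Steele scheme, run sequentially)."""
    out = list(xs)
    d = 1
    while d < len(out):
        prev = list(out)  # snapshot of the previous round
        for i in range(d, len(out)):
            out[i] = op(prev[i - d], prev[i])
        d *= 2
    return out


def swapped_operands(xs, op):
    """Deliberately wrong variant: passes operands in the wrong order."""
    out = []
    for i, x in enumerate(xs):
        out.append(x if i == 0 else op(x, out[-1]))
    return out


def passes_interval_test(prefix_sum, n):
    """Run the single abstract test: input [i, i], expected result [0, i]."""
    inputs = [Interval(i, i) for i in range(n)]
    expected = [Interval(0, i) for i in range(n)]
    return prefix_sum(inputs, combine) == expected


if __name__ == "__main__":
    print(passes_interval_test(hillis_steele, 8))
    print(passes_interval_test(swapped_operands, 8))
```

In this sketch the operand-order bug is invisible when the implementation is tested with integer addition, which is commutative, yet the abstract test rejects it because combining `Interval(1, 1)` with `Interval(0, 0)` in that order is not an adjacent pair. Each abstract element stores two indices of about $\lg(n)$ bits, which is in line with the $O(n \lg(n))$ space bound stated for the input and result.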
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
5 articles.
1. AuDaLa is Turing Complete;Lecture Notes in Computer Science;2024
2. An Autonomous Data Language;Theoretical Aspects of Computing – ICTAC 2023;2023
3. Formal Verification of Parallel Prefix Sum;Lecture Notes in Computer Science;2020
4. Formal Verification of Parallel Stream Compaction and Summed-Area Table Algorithms;Theoretical Aspects of Computing – ICTAC 2020;2020
5. HLS-based optimization and design space exploration for applications with variable loop bounds;Proceedings of the International Conference on Computer-Aided Design;2018-11-05