Flattening and parallelizing irregular, recurrent loop nests

Published:1995-08 Issue:8 Volume:30 Page:58-67
ISSN:0362-1340
Container-title:ACM SIGPLAN Notices
language:en
Short-container-title:SIGPLAN Not.

Author:

Ghuloum Anwar M.¹, Fisher Allan L.¹

Affiliation:

1. School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA

Abstract

Irregular loop nests in which the loop bounds are determined dynamically by indexed arrays are difficult to compile into expressive parallel constructs, such as segmented scans and reductions. In this paper, we describe a suite of transformations to automatically parallelize such irregular loop nests, even in the presence of recurrences. We describe a simple, general loop flattening transformation, along with new optimizations which make it a viable compiler transformation. A robust recurrence parallelization technique is coupled to the loop flattening transformation, allowing parallelization of segmented reductions, scans, and combining-sends over arbitrary associative operators. We discuss the implementation and performance results of the transformations in a parallelizing Fortran 77 compiler for the Cray C90 supercomputer. In particular, we focus on important sparse matrix-vector multiplication kernels, for one of which we are able to automatically derive an algorithm used by one of the fastest library routines available.
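As an illustration of the loop flattening idea the abstract describes, here is a minimal Python sketch of sparse matrix-vector multiplication. It is not the paper's Fortran 77 implementation or its compiler output. The compressed-row layout and the names `row_ptr`, `col_idx`, `vals`, `spmv_nested`, and `spmv_flattened` are assumptions chosen for this example, and `row_ptr[0]` is assumed to be 0. The first function is the irregular loop nest, whose inner bounds come from an indexed array; the second flattens it into a single pass over all nonzeros followed by a segmented sum, written sequentially here for clarity.

```python
def spmv_nested(row_ptr, col_idx, vals, x):
    """Irregular loop nest: inner loop bounds are read from row_ptr."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def spmv_flattened(row_ptr, col_idx, vals, x):
    """Flattened form: one loop over all nonzeros, then a segmented sum."""
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]  # assumes row_ptr[0] == 0

    # Elementwise products have no cross-iteration dependence,
    # so this step could run fully in parallel.
    prod = [vals[k] * x[col_idx[k]] for k in range(nnz)]

    # Segment descriptor: the row that owns each nonzero.
    seg = [0] * nnz
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            seg[k] = i

    # Segmented reduction with "+", written as a sequential loop.
    # A parallel version would use a segmented-scan primitive instead.
    y = [0.0] * n_rows
    for k in range(nnz):
        y[seg[k]] += prod[k]
    return y


# Example matrix [[1, 0, 2], [0, 0, 0], [3, 4, 0]] with an empty middle row.
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 0, 1]
vals = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 2.0, 3.0]
y_nested = spmv_nested(row_ptr, col_idx, vals, x)
y_flat = spmv_flattened(row_ptr, col_idx, vals, x)
```

Both functions produce the same vector. The flattened version exposes a single loop whose trip count is the total number of nonzeros, so the work is balanced regardless of how unevenly the rows are populated.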

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design, Software

Link

https://dl.acm.org/doi/pdf/10.1145/209937.209944

References: 14 articles.

1. Scans as primitive parallel operations

2. Implementation of a portable nested data-parallel language

3. Solving linear recurrences with loop raking

Cited by 4 articles.

1. Batchman and Robin: Batched and Non-batched Branching for Interactive ZK;Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security;2023-11-15

2. Beacons: An End-to-End Compiler Framework for Predicting and Utilizing Dynamic Loop Characteristics;Proceedings of the ACM on Programming Languages;2023-10-16

3. Source code transformations and optimizations;Embedded Computing for High Performance;2017

4. Efficient RAM and Control Flow in Verifiable Outsourced Computation;Proceedings 2015 Network and Distributed System Security Symposium;2015