Generating Fine-Grain Multithreaded Applications Using a Multigrain Approach

Published: 2017-12-20 Issue: 4 Volume: 14 Pages: 1-26
ISSN: 1544-3566
Container-title: ACM Transactions on Architecture and Code Optimization
Language: en
Short-container-title: ACM Trans. Archit. Code Optim.

Author:

Arteaga Jaime¹, Zuckerman Stéphane², Gao Guang R.¹

Affiliation:

1. University of Delaware, Newark, DE, USA

2. Michigan Technological University, Houghton, MI, USA

Abstract

The recent evolution in hardware landscape, aimed at producing high-performance computing systems capable of reaching extreme-scale performance, has reignited the interest in fine-grain multithreading, particularly at the intranode level. Indeed, popular parallel programming environments, such as OpenMP, which features a simple interface for the parallelization of programs, are now incorporating fine-grain constructs. However, since coarse-grain directives are still heavily used, the OpenMP runtime is forced to support both coarse- and fine-grain models of execution, potentially reducing the advantages obtained when executing an application in a fully fine-grain environment. To evaluate the type of applications that benefit from executing in a unified fine-grain program execution model, this article presents a multigrain parallel programming environment for the generation of fine-grain multithreaded applications from programs featuring OpenMP’s API, allowing OpenMP programs to be run on top of a fine-grain event-driven program execution model. Experimental results with five scientific benchmarks show that fine-grain applications, generated by and run on our environment with two runtimes implementing a fine-grain event-driven program execution model, are competitive and can outperform their OpenMP counterparts, especially for data-intensive workloads with irregular and dynamic parallelism, reaching speedups as high as 2.6× for Graph500 and 51× for NAS Data Cube.

Funder

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3155288

References: 25 articles.

1. libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms

Cited by 4 articles.

1. A Profile-Based AI-Assisted Dynamic Scheduling Approach for Heterogeneous Architectures;International Journal of Parallel Programming;2021-08-23

2. CODIR: Towards an MLIR Codelet Model Dialect;2020 IEEE/ACM Fourth Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM);2020-11

3. DEMAC: A Modular Platform for HW-SW Co-Design;2020 IEEE/ACM Fourth Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM);2020-11

4. PDAWL: Profile-Based Iterative Dynamic Adaptive WorkLoad Balance on Heterogeneous Architectures;Job Scheduling Strategies for Parallel Processing;2020