Affiliation:
1. Université de Lille, CNRS/CRIStAL UMR 9189, Centre Inria de l'Université de Lille, France
2. Université du Luxembourg, FSTM, Luxembourg
3. Université du Luxembourg, DCS‐FSTM/SnT, Luxembourg
Abstract
With the recent arrival of the exascale era, modern supercomputers are increasingly large, which makes programming them much more complex. In addition to performance, software productivity is a major concern when choosing a programming language, such as Chapel, that is designed for exascale computing. In this paper, we investigate the design of a parallel distributed tree‐search algorithm, namely P3D‐DFS, and its implementation using Chapel. The design is based on Chapel's DistBag data structure, revisited by: (1) redefining the data structure for Depth‐First tree‐Search (DFS), henceforth renamed DistBag‐DFS; (2) redesigning the underlying load balancing mechanism. In addition, we propose two instantiations of P3D‐DFS considering the Branch‐and‐Bound (B&B) and Unbalanced Tree Search (UTS) algorithms. In order to evaluate how much performance is traded for productivity, we compare the Chapel‐based implementations of B&B and UTS to their best‐known counterparts based on traditional OpenMP (intra‐node) and MPI+X (inter‐node). For experimental validation using 4096 processing cores, we consider the permutation flow‐shop scheduling problem for B&B and synthetic literature benchmarks for UTS. The reported results show that P3D‐DFS competes with its OpenMP baselines in coarser‐grained shared‐memory scenarios, and with its MPI+X counterparts in distributed‐memory settings, in terms of both performance and productivity. In the context of this work, this makes Chapel an alternative to OpenMP/MPI+X for exascale programming.
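The general idea of a depth-first work pool with load balancing can be illustrated with a small sketch. This is not the authors' DistBag‐DFS or their Chapel code; it is a single-threaded Python simulation under assumed rules: each worker owns a double-ended queue, processes its own nodes in LIFO order (depth-first), and an idle worker steals the oldest node from the fullest queue. The names `children` and `explore`, the synthetic binary tree, and the stealing policy are all hypothetical choices made for illustration.

```python
from collections import deque


def children(depth, max_depth=4, branching=2):
    """Hypothetical synthetic tree: a node is identified by its depth only."""
    if depth < max_depth:
        return [depth + 1] * branching
    return []


def explore(root, num_workers=4):
    """Count explored nodes per worker, simulating workers round-robin."""
    bags = [deque() for _ in range(num_workers)]
    bags[0].append(root)
    explored = [0] * num_workers

    while any(bags):  # some bag still holds work
        for w in range(num_workers):
            if not bags[w]:
                # Idle worker: try to steal from the fullest bag (assumed policy).
                victim = max(range(num_workers), key=lambda v: len(bags[v]))
                if len(bags[victim]) > 1:
                    # Take the oldest node, which tends to root a larger subtree.
                    bags[w].append(bags[victim].popleft())
                continue
            # Owner side: LIFO access gives depth-first order.
            node = bags[w].pop()
            explored[w] += 1
            bags[w].extend(children(node))
    return explored


if __name__ == "__main__":
    counts = explore(0)
    print(counts, sum(counts))
```

Stealing from the opposite end of the owner's access point is a common design choice in work-stealing pools: the owner keeps working on deep nodes while thieves receive shallow ones, which usually represent more remaining work per transfer.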
Funder
Agence Nationale de la Recherche
Fonds National de la Recherche Luxembourg
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software
Cited by
2 articles.
1. GPU-Accelerated Tree-Search in Chapel Versus CUDA and HIP;2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW);2024-05-27
2. PGAS Data Structure for Unbalanced Tree-Based Algorithms at Scale;Lecture Notes in Computer Science;2024