Dual tree traversal on integrated GPUs for astrophysical N-body simulations


Fortin, Pierre (1, 2); Touche, Maxime (1)


1. Sorbonne Université, UPMC Univ Paris 06, CNRS, Laboratoire d’Informatique de Paris 6 (LIP6) UMR 7606, Paris, France

2. University of Lille, CNRS, Centrale Lille, CRIStAL UMR 9189, Lille, France


In astrophysical N-body simulations, O(N) fast multipole methods (FMMs) with dual tree traversal (DTT) on multi-core CPUs are faster than O(N log N) CPU tree-codes but can still be outperformed by GPU tree-codes. In this article, we aim to combine the best algorithm, namely FMM with DTT, with the most powerful hardware currently available, namely GPUs. In the astrophysical context, which requires only low accuracy and involves non-uniform particle distributions, we show that such a combination can be achieved with a hybrid CPU-GPU algorithm on integrated GPUs: while the DTT is performed on the CPU cores, the far- and near-field computations are all performed on the GPU cores. We show how to efficiently expose the interactions resulting from the DTT to the GPU cores, how to deploy both the far- and near-field computations on the GPU, and how to overlap the parallel DTT on the CPU with the GPU computations. Based on the falcON code and using OpenCL on AMD Accelerated Processing Units and on Intel integrated GPUs, this first heterogeneous deployment of DTT for FMM outperforms standard multi-core CPUs and matches GPU and high-end CPU performance, and is hence more cost- and power-efficient.
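The split described above, where the traversal produces interactions that are then computed elsewhere, can be illustrated with a minimal Python sketch. It is not the falcON implementation nor the authors' OpenCL code. It assumes a 1D binary tree, a simple geometric acceptance criterion with a hypothetical opening parameter `theta`, and ordered (target, source) pairs. All names (`Cell`, `build_tree`, `dual_tree_traversal`) are illustrative. The traversal only collects two lists, far-field and near-field cell pairs, which is the kind of output a hybrid scheme could hand over to GPU kernels in batches.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A 1D tree cell covering [lo, hi] and holding particle indices."""
    lo: float
    hi: float
    indices: list
    children: list = field(default_factory=list)

    @property
    def center(self):
        return 0.5 * (self.lo + self.hi)

    @property
    def radius(self):
        return 0.5 * (self.hi - self.lo)


def build_tree(points, indices, lo, hi, leaf_size=2, depth=0, max_depth=32):
    """Recursively bisect [lo, hi]; empty halves get no child cell."""
    cell = Cell(lo, hi, indices)
    if len(indices) > leaf_size and depth < max_depth:
        mid = 0.5 * (lo + hi)
        left = [i for i in indices if points[i] < mid]
        right = [i for i in indices if points[i] >= mid]
        if left:
            cell.children.append(
                build_tree(points, left, lo, mid, leaf_size, depth + 1, max_depth))
        if right:
            cell.children.append(
                build_tree(points, right, mid, hi, leaf_size, depth + 1, max_depth))
    return cell


def well_separated(a, b, theta):
    """Assumed geometric criterion: cell sizes are small relative to distance."""
    return a.radius + b.radius < theta * abs(a.center - b.center)


def dual_tree_traversal(a, b, theta, far, near):
    """Traverse target cell a and source cell b simultaneously.

    Appends (a, b) to `far` when a multipole approximation is acceptable,
    to `near` when both cells are leaves (direct summation), and otherwise
    splits the larger cell and recurses.
    """
    if well_separated(a, b, theta):
        far.append((a, b))
    elif not a.children and not b.children:
        near.append((a, b))
    elif not b.children or (a.children and a.radius >= b.radius):
        for child in a.children:
            dual_tree_traversal(child, b, theta, far, near)
    else:
        for child in b.children:
            dual_tree_traversal(a, child, theta, far, near)
```

Calling `dual_tree_traversal(root, root, theta, far, near)` on the root cell fills both lists. In this sketch every ordered particle pair is covered by exactly one cell pair, so the two lists can be processed independently and in any order.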


SAGE Publications


Hardware and Architecture,Theoretical Computer Science,Software

Cited by 4 articles.

1. N-Body Simulation Inspired by Metaheuristics Optimization;Computer Systems Science and Engineering;2022

2. Adaptive tiling for parallel N-body simulations on many core;Astronomy and Computing;2021-07

3. Fast Multipole Methods for N-body Simulations of Collisional Star Systems;The Astrophysical Journal;2021-07-01

4. A GPU-Accelerated Barycentric Lagrange Treecode;2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW);2020-05







