Parallel computation to bidimensional heat equation using MPI/CUDA and FFTW package

Published:2024-01-11 Volume:5
ISSN:2624-9898
Container-title:Frontiers in Computer Science
Short-container-title:Front. Comput. Sci.

Author:

Chakkour Tarik

Abstract

In this study, we present a fast algorithm for the numerical solution of the heat equation, which models the diffusion of heat over time through a given region. We use a finite difference method to solve this equation numerically. The performance of its parallel implementation is assessed using the Message Passing Interface (MPI) and the Compute Unified Device Architecture (CUDA), combined with time-stepping schemes such as the Forward Euler (FE) and Runge-Kutta (RK) methods. The originality of this study lies in its investigation of parallel implementations of the fourth-order Runge-Kutta method (RK4) for sparse matrices on Graphics Processing Unit (GPU) architectures. CUDA, provided by NVIDIA, is the leading proprietary framework for GPU computing. We use three metrics to compare the computing performance of these parallelizations: time-to-solution, speed-up, and performance. The spectral method is investigated using the FFTW software library, which computes fast Fourier transforms (FFT) on parallel and distributed-memory architectures. The CUDA-based FFT library, CUFFT, is a highly optimized FFTW-style implementation for GPU platforms. We present numerical tests showing that this method is promising for solving the heat equation. The final results demonstrate that CUDA has a significant performance advantage, since its computational cost is small compared with that of the MPI implementation. This substantial performance gain is also achieved through careful management of memory communication and access.
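
To make the two time schemes named in the abstract concrete, the following is a minimal serial sketch in pure Python, not the paper's MPI/CUDA implementation. It assumes a unit square with zero Dirichlet boundaries, a five-point finite-difference Laplacian, and the initial condition sin(πx)sin(πy); the function names (`laplacian`, `step_fe`, `step_rk4`, `time_error`) and all parameter values are hypothetical choices made for illustration. Because this initial condition is an eigenvector of the discrete Laplacian, the semi-discrete solution is known in closed form, so the error reported below isolates the time-integration error of each scheme.

```python
import math

def laplacian(u, h):
    """Five-point Laplacian; boundary entries are left at zero (Dirichlet)."""
    n = len(u)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap[i][j] = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1]
                         + u[i][j + 1] - 4.0 * u[i][j]) / (h * h)
    return lap

def axpy(u, k, s):
    """Return u + s * k elementwise."""
    return [[a + s * b for a, b in zip(ru, rk)] for ru, rk in zip(u, k)]

def step_fe(u, dt, h, alpha):
    """One Forward Euler step of u_t = alpha * Laplacian(u)."""
    return axpy(u, laplacian(u, h), alpha * dt)

def step_rk4(u, dt, h, alpha):
    """One classical fourth-order Runge-Kutta step of the same system."""
    def f(v):
        return [[alpha * x for x in row] for row in laplacian(v, h)]
    k1 = f(u)
    k2 = f(axpy(u, k1, dt / 2))
    k3 = f(axpy(u, k2, dt / 2))
    k4 = f(axpy(u, k3, dt))
    out = u
    for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
        out = axpy(out, k, dt * w / 6)
    return out

def time_error(step, n=17, alpha=1.0, t_end=0.01, steps=40):
    """Max interior error against the exact semi-discrete solution."""
    h = 1.0 / (n - 1)
    dt = t_end / steps  # assumed to satisfy the FE limit dt <= h*h / (4*alpha)
    u0 = [[math.sin(math.pi * i * h) * math.sin(math.pi * j * h)
           for j in range(n)] for i in range(n)]
    u = u0
    for _ in range(steps):
        u = step(u, dt, h, alpha)
    # Eigenvalue of the discrete Laplacian for the sin(pi x) sin(pi y) mode.
    lam = -8.0 * alpha * math.sin(math.pi * h / 2) ** 2 / (h * h)
    decay = math.exp(lam * t_end)
    return max(abs(u[i][j] - decay * u0[i][j])
               for i in range(1, n - 1) for j in range(1, n - 1))

if __name__ == "__main__":
    print("FE  time-integration error:", time_error(step_fe))
    print("RK4 time-integration error:", time_error(step_rk4))
```

With the same step size, RK4 costs four Laplacian evaluations per step against one for Forward Euler, but its time-integration error is orders of magnitude smaller on this test problem. Each interior grid point is updated independently within one Laplacian evaluation, which is the part a GPU or MPI decomposition would distribute.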

Publisher

Frontiers Media SA

Cited by 2 articles.

1. High-quality implementation for a continuous-in-time financial API in C#;Frontiers in Computer Science;2024-07-25

2. Parallel Numerical Solution of 2D Electrostatics Poisson Equation on Different Mesh Partitioning Schemes;VFAST Transactions on Mathematics;2024-06-30