Hardware Acceleration of High-Performance Computational Flow Dynamics Using High-Bandwidth Memory-Enabled Field-Programmable Gate Arrays

Published:2022-06-30 Issue:2 Volume:15 Page:1-35
ISSN:1936-7406
Container-title:ACM Transactions on Reconfigurable Technology and Systems
language:en
Short-container-title:ACM Trans. Reconfigurable Technol. Syst.

Author:

Hogervorst Tom¹, Nane Răzvan¹, Marchiori Giacomo², Qiu Tong Dong², Blatt Markus³, Rustad Alf Birger⁴

Affiliation:

1. Delft University of Technology, Delft, The Netherlands

2. Big Data Accelerate B.V., Delft, The Netherlands

3. OPM-OS AS, Oslo, Norway

4. Equinor S.A., Rotvoll, Norway

Abstract

Scientific computing is at the core of many High-Performance Computing applications, including computational flow dynamics. Because of the utmost importance to simulate increasingly larger computational models, hardware acceleration is receiving increased attention due to its potential to maximize the performance of scientific computing. Field-Programmable Gate Arrays could accelerate scientific computing because of the possibility to fully customize the memory hierarchy important in irregular applications such as iterative linear solvers. In this article, we study the potential of using Field-Programmable Gate Arrays in High-Performance Computing because of the rapid advances in reconfigurable hardware, such as the increase in on-chip memory size, increasing number of logic cells, and the integration of High-Bandwidth Memories on board. To perform this study, we propose a novel Sparse Matrix-Vector multiplication unit and an ILU0 preconditioner tightly integrated with a BiCGStab solver kernel. We integrate the developed preconditioned iterative solver in Flow from the Open Porous Media project, a state-of-the-art open source reservoir simulator. Finally, we perform a thorough evaluation of the FPGA solver kernel in both stand-alone mode and integrated in the reservoir simulator, using the NORNE field, a real-world case reservoir model using a grid with more than 10^5 cells and using three unknowns per cell.
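
For readers unfamiliar with the Sparse Matrix-Vector multiplication (SpMV) named in the abstract, the following is a minimal sketch of the standard textbook operation on a matrix in compressed sparse row (CSR) form. It is not the article's FPGA unit, whose design is not described on this page. The function name `csr_spmv`, the CSR layout, and the 3x3 example matrix are assumptions chosen for illustration. The indirect read `x[col_idx[k]]` is the kind of irregular memory access that makes the memory hierarchy matter for iterative linear solvers.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values  : nonzero entries of A, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the entries of row i
    x       : dense input vector
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect, data-dependent read of x: the irregular access pattern.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 example matrix (not taken from the article):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_spmv(values, col_idx, row_ptr, x)
print(y)
```

A Krylov solver such as BiCGStab calls an operation of this kind repeatedly in each iteration, which is why SpMV throughput tends to dominate solver run time.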

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3476229

References (36 articles)

1. NVIDIA. n.d. NVIDIA Nsight Compute Command Line Interface. Retrieved November 3 2021 from https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.

2. AMD. 2020. AMD to Acquire Xilinx. Retrieved November 3 2021 from https://www.amd.com/en/press-releases/2020-10-27-amd-to-acquire-xilinx-creating-the-industry-s-high-performance-computing.

3. The Dune framework: Basic concepts and recent developments

4. Chow Edmond. 2015. Fine-grained parallel incomplete LU factorization. SIAM Journal on Scientific Computing.

Cited by 1 article.

1. Iris;Proceedings of the 28th Asia and South Pacific Design Automation Conference;2023-01-16