Enabling unstructured-mesh computation on massively tiled AI processors: An example of accelerating in silico cardiac simulation

Published: 2023-03-30
Volume: 11
ISSN: 2296-424X
Container-title: Frontiers in Physics
Short-container-title: Front. Phys.

Author:

Burchard Luk, Hustad Kristian Gregorius, Langguth Johannes, Cai Xing

Abstract

A new trend in processor architecture design is the packaging of thousands of small processor cores into a single device, where there is no device-level shared memory but each core has its own local memory. Thus, both the work and data of an application code need to be carefully distributed among the small cores, also termed tiles. In this paper, we investigate how numerical computations that involve unstructured meshes can be efficiently parallelized and executed on a massively tiled architecture. Graphcore IPUs are chosen as the target hardware platform, to which we port an existing monodomain solver that simulates cardiac electrophysiology over realistic 3D irregular heart geometries. The simulator has two computational kernels: a 3D diffusion equation is discretized over an unstructured mesh and numerically approximated by repeatedly executing sparse matrix-vector multiplications (SpMVs), and an individual system of ordinary differential equations (ODEs) is explicitly integrated per mesh cell. We demonstrate how a new style of programming that uses Poplar/C++ can be used to port these commonly encountered computational tasks to Graphcore IPUs. In particular, we describe a per-tile data structure that is adapted to facilitate the inter-tile data exchange needed for parallelizing the SpMVs. We also study the achievable performance of the ODE solver, which depends heavily on special mathematical functions, as well as the accuracy of these functions on Graphcore IPUs. Moreover, topics related to using multiple IPUs and performance analysis are addressed. In addition to demonstrating an impressive level of performance that can be achieved by IPUs for monodomain simulation, we also provide a discussion on the generic theme of parallelizing and executing unstructured-mesh multiphysics computations on massively tiled hardware.
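
To make the idea of a per-tile data structure with inter-tile exchange more concrete, the following is a minimal sketch in plain Python. It is not the authors' Poplar/C++ implementation and uses no Graphcore API. It assumes a CSR matrix stored in Python lists, a contiguous row range per tile (a real unstructured mesh would more likely be split by a graph partitioner), and hypothetical helper names (`csr_spmv`, `build_tile`, `tiled_spmv`). Each tile stores only its own rows, renumbers columns so that owned entries come first and remotely owned ("halo") entries follow, and gathers the halo values before its local SpMV.

```python
def csr_spmv(indptr, indices, data, x):
    """Reference y = A @ x for a CSR matrix stored in plain lists."""
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def build_tile(indptr, indices, data, row_begin, row_end):
    """Extract rows [row_begin, row_end) and renumber their columns locally.

    Local layout (an assumption of this sketch): owned entries first,
    then halo entries owned by other tiles, in ascending global order.
    """
    n_owned = row_end - row_begin
    halo = set()
    for k in range(indptr[row_begin], indptr[row_end]):
        col = indices[k]
        if not row_begin <= col < row_end:
            halo.add(col)
    halo = sorted(halo)
    local_of = {g: n_owned + i for i, g in enumerate(halo)}

    l_indptr, l_indices, l_data = [0], [], []
    for row in range(row_begin, row_end):
        for k in range(indptr[row], indptr[row + 1]):
            col = indices[k]
            if row_begin <= col < row_end:
                l_indices.append(col - row_begin)
            else:
                l_indices.append(local_of[col])
            l_data.append(data[k])
        l_indptr.append(len(l_indices))
    return {"rows": (row_begin, row_end), "halo": halo,
            "indptr": l_indptr, "indices": l_indices, "data": l_data}


def tiled_spmv(tiles, x):
    """Run the SpMV tile by tile; the list gather stands in for the exchange."""
    y = [0.0] * len(x)
    for tile in tiles:
        begin, end = tile["rows"]
        # Exchange step: fetch halo values from the tiles that own them.
        x_local = x[begin:end] + [x[g] for g in tile["halo"]]
        y[begin:end] = csr_spmv(tile["indptr"], tile["indices"],
                                tile["data"], x_local)
    return y


# Toy 4x4 tridiagonal matrix (2 on the diagonal, -1 off it), split over 2 tiles.
indptr = [0, 2, 5, 8, 10]
indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0]
tiles = [build_tile(indptr, indices, data, 0, 2),
         build_tile(indptr, indices, data, 2, 4)]
print(tiled_spmv(tiles, x) == csr_spmv(indptr, indices, data, x))
```

In this toy case each tile needs exactly one halo value from its neighbour. On real tiled hardware the gather would become an explicit inter-tile copy, and the halo lists would be precomputed once because the mesh connectivity does not change between time steps.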

Publisher

Frontiers Media SA

Subject

Physical and Theoretical Chemistry, General Physics and Astronomy, Mathematical Physics, Materials Science (miscellaneous), Biophysics

Reference: 35 articles.

1. Chapter 6–mesh generation;Bern,2000

2. Trends in data locality abstractions for HPC systems;Unat;IEEE Trans Parallel Distributed Syst,2017

3. Recent advances in graph partitioning. Algorithm engineering: Selected Results and surveys (springer);Buluç;Lecture Notes Comp Sci,2016

Cited by 1 article.

1. Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU;Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis;2023-11-11