Affiliation:
1. The University of Texas at Austin
Abstract
We present a parallel sparse direct solver for multicore architectures based on Directed Acyclic Graph (DAG) scheduling. Recently, DAG scheduling has become popular in advanced Dense Linear Algebra libraries due to its efficient asynchronous parallel execution of tasks. However, its application to sparse matrix problems is more challenging as it has to deal with an enormous number of highly irregular tasks. This typically results in substantial scheduling overhead both in time and space, which causes overall parallel performance to be suboptimal. We describe a parallel solver based on two-level task parallelism: tasks are first generated from a parallel tree traversal on the assembly tree; next, those tasks are further refined by using algorithms-by-blocks to gain fine-grained parallelism. The resulting fine-grained tasks are asynchronously executed after their dependencies are analyzed. Our approach is distinct from others in that we adopt two-level task scheduling to mirror the two-level parallelism. As a result, we reduce scheduling overhead, and increase efficiency and flexibility. The proposed parallel sparse direct solver is evaluated for the particular problems arising from the hp-Finite Element Method where conventional sparse direct solvers do not scale well.
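The two-level structure described in the abstract can be illustrated with a small symbolic sketch. It is not the authors' implementation: the choice of a blocked Cholesky factorization as the per-node kernel, the tile-based dependency analysis, and every function name below (`postorder`, `cholesky_block_tasks`, `build_dag`, `schedule`, `two_level_order`) are assumptions made for illustration, and a sequential loop stands in for asynchronous execution.

```python
from collections import defaultdict, deque

def postorder(tree, root):
    """Level 1: visit assembly-tree nodes so that children precede parents."""
    out = []
    for child in tree.get(root, []):
        out.extend(postorder(tree, child))
    out.append(root)
    return out

def cholesky_block_tasks(nb):
    """Level 2: fine-grained tasks of a blocked Cholesky on an nb x nb tile grid.
    Each task is (name, tiles_read, tiles_written); written tiles are read-write."""
    tasks = []
    for k in range(nb):
        tasks.append((f"chol({k},{k})", set(), {(k, k)}))
        for i in range(k + 1, nb):
            tasks.append((f"trsm({i},{k})", {(k, k)}, {(i, k)}))
        for i in range(k + 1, nb):
            for j in range(k + 1, i + 1):
                tasks.append((f"update({i},{j};{k})", {(i, k), (j, k)}, {(i, j)}))
    return tasks

def build_dag(tasks):
    """Derive dependencies from tile accesses (read-after-write,
    write-after-write, and write-after-read)."""
    last_writer = {}
    readers = defaultdict(list)
    deps = {name: set() for name, _, _ in tasks}
    for name, reads, writes in tasks:
        for tile in reads | writes:
            if tile in last_writer:
                deps[name].add(last_writer[tile])
        for tile in writes:
            deps[name].update(readers[tile])
            readers[tile] = []
            last_writer[tile] = name
        for tile in reads - writes:
            readers[tile].append(name)
    return deps

def schedule(deps):
    """Release a task once all of its dependencies have completed (Kahn's algorithm)."""
    remaining = {name: len(d) for name, d in deps.items()}
    successors = defaultdict(list)
    for name, d in deps.items():
        for pred in d:
            successors[pred].append(name)
    ready = deque(name for name, count in remaining.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for succ in successors[name]:
            remaining[succ] -= 1
            if remaining[succ] == 0:
                ready.append(succ)
    return order

def two_level_order(tree, root, nb):
    """Coarse tasks come from the tree traversal; each one is refined into
    block tasks that are ordered by their own local DAG."""
    order = []
    for node in postorder(tree, root):
        deps = build_dag(cholesky_block_tasks(nb))
        order.extend(f"{node}:{task}" for task in schedule(deps))
    return order

if __name__ == "__main__":
    tree = {"root": ["left", "right"]}  # hypothetical three-node assembly tree
    for task in two_level_order(tree, "root", nb=2):
        print(task)
```

In this sketch each tree node builds and consumes only its own small DAG, which is one plausible way to read the claim that two-level scheduling limits scheduling overhead in time and space; a real solver would run independent subtrees and ready block tasks concurrently.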
Funder
Advanced Cyberinfrastructure
Publisher
Association for Computing Machinery (ACM)
Subject
Applied Mathematics, Software
Cited by
22 articles.
1. A study of concurrent multi-frontal solvers for modern massively parallel architectures;Journal of Computational Science;2024-01
2. Algorithms for tree-shaped task partition and allocation on heterogeneous multiprocessors;The Journal of Supercomputing;2023-03-22
3. Parallel finite element solver PARFES for the structural analysis in NUMA architecture;Advances in Engineering Software;2022-12
4. Task Tree Partition and Subtree Allocation for Heterogeneous Multiprocessors;2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom);2021-09
5. A task-based distributed parallel sparsified nested dissection algorithm;Proceedings of the Platform for Advanced Scientific Computing Conference;2021-07-05