cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs

Author:

Valero-Lara Pedro (1), Martínez-Pérez Ivan (1), Sirvent Raül (1), Martorell Xavier (1, 2), Peña Antonio J. (1)

Affiliation:

1. Barcelona Supercomputing Center (BSC); Barcelona, Spain

2. Universitat Politècnica de Catalunya; Barcelona, Spain

Funder

European Union's Horizon 2020 research and innovation programme

Spanish Ministry of Economy and Competitiveness under the project Computación de Altas Prestaciones VII

Departament d'Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d'Execució Paral·lels

Publisher

Wiley

Subject

Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software

References: 18 articles.

1. Many-task computing on many-core architectures;Valero-Lara;Scalable Comput: Pract Exp,2016

2. The design and performance of batched BLAS on modern high-performance computing systems;Dongarra;Procedia Comput Sci,2017

3. cuHinesBatch: solving multiple Hines systems on GPUs Human Brain Project;Valero-Lara;Procedia Comput Sci,2017

4. Energy-efficient scheduling algorithms for batch-of-tasks (BoT) applications on heterogeneous computing systems;Sajid;Concurrency Computat Pract Exper,2016

5. Optimizing tridiagonal solvers for alternating direction methods on Boolean cube multiprocessors;Ho;SIAM J Sci Stat Comput,1990

Cited by 28 articles.

1. swPTS: an efficient parallel Thomas split algorithm for tridiagonal systems on Sunway manycore processors;The Journal of Supercomputing;2023-09-19

2. Integrating batched sparse iterative solvers for the collision operator in fusion plasma simulations on GPUs;Journal of Parallel and Distributed Computing;2023-08

3. Interactive Hair Simulation on the GPU using ADMM;Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Proceedings;2023-07-23

4. A Portable and Heterogeneous LU Factorization on IRIS;Euro-Par 2022: Parallel Processing Workshops;2023

5. LaRIS: Targeting Portability and Productivity for LAPACK Codes on Extreme Heterogeneous Systems by Using IRIS;2022 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA);2022-11
