Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors

Published:2023-03-10 Page:109434202311576
ISSN:1094-3420
Container-title:The International Journal of High Performance Computing Applications
language:en
Short-container-title:The International Journal of High Performance Computing Applications

Author:

Rodríguez-Sánchez Rafael¹, Castelló Adrián², Catalán Sandra¹, Igual Francisco D.¹, Quintana-Ortí Enrique S.²

Affiliation:

1. Departamento Arquitectura de Computadores y Automática, Facultad de Ciencias Físicas - Desp. 230, Universidad Complutense de Madrid, Spain

2. Departamento de Informática de Sistemas y Computadores, Universitat Politècnica de València, Spain

Abstract

Malleability is defined as the ability to vary the degree of parallelism at runtime, and is regarded as a means to improve core occupation on state-of-the-art multicore processors that contain tens of computational cores per socket. This property is especially interesting for applications consisting of irregular workloads and/or divergent execution paths. The integration of malleability in high-performance instances of the Basic Linear Algebra Subprograms (BLAS) is currently nonexistent and, in consequence, applications relying on these computational kernels cannot benefit from this capability. In response to this scenario, in this paper we demonstrate that significant performance benefits can be obtained by exploiting malleability in a framework designed to implement portable and high-performance BLAS-like operations. For this purpose, we integrate malleability within the BLIS library, and provide an experimental evaluation of the result on three different practical use cases.

Funder

Generalitat Valenciana

Comunidad de Madrid

Ministerio de Ciencia, Innovación y Universidades

Universidad Complutense de Madrid

Publisher

SAGE Publications

Subject

Hardware and Architecture, Theoretical Computer Science, Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/10943420231157653

Reference: 28 articles.

1. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures

2. Parallelizing dense and banded linear algebra libraries using SMPSs

3. Matrix inversion on CPU-GPU platforms with applications in control theory

4. A class of parallel tiled linear algebra algorithms for multicore architectures

5. Programming parallel dense matrix factorizations with look-ahead and OpenMP

Cited by 1 article.

1. Malleability techniques applications in high-performance computing;The International Journal of High Performance Computing Applications;2024-03