Optimizing the Performance of the Sparse Matrix–Vector Multiplication Kernel in FPGA Guided by the Roofline Model

Author:

Favaro Federico1ORCID,Dufrechou Ernesto2,Oliver Juan P.1ORCID,Ezzatti Pablo2

Affiliation:

1. Instituto de Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de la República, Montevideo 11300, Uruguay

2. Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Montevideo 11300, Uruguay

Abstract

The widespread adoption of massively parallel processors over the past decade has fundamentally transformed the landscape of high-performance computing hardware. This revolution has recently driven the advancement of FPGAs, which are emerging as an attractive alternative to power-hungry many-core devices in a world increasingly concerned with energy consumption. Consequently, numerous recent studies have focused on implementing efficient dense and sparse numerical linear algebra (NLA) kernels on FPGAs. To maximize the efficiency of these kernels, a key aspect is the exploration of analytical tools to comprehend the performance of the developments and guide the optimization process. In this regard, the roofline model (RLM) is a well-known graphical tool that facilitates the analysis of computational performance and identifies the primary bottlenecks of a specific software when executed on a particular hardware platform. Our previous efforts advanced in developing efficient implementations of the sparse matrix–vector multiplication (SpMV) for FPGAs, considering both speed and energy consumption. In this work, we propose an extension of the RLM that enables optimizing runtime and energy consumption for NLA kernels based on sparse blocked storage formats on FPGAs. To test the power of this tool, we use it to extend our previous SpMV kernels by leveraging a block-sparse storage format that enables more efficient data access.

Funder

Universidad de la República, Uruguay and Agencia Nacional de Investigación e Innovación

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Mechanical Engineering,Control and Systems Engineering

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3