AAQAL: A Machine Learning-Based Tool for Performance Optimization of Parallel SPMV Computations Using Block CSR

Published:2022-07-13 Issue:14 Volume:12 Page:7073
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Ahmed Muhammad, Usman Sardar, Shah Nehad Ali, Ashraf M. Usman, Alghamdi Ahmed Mohammed, Bahadded Adel A., Almarhabi Khalid Ali

Abstract

The sparse matrix–vector product (SpMV), considered one of the seven dwarfs (numerical methods of significance), is a key computing operation in high-performance real-world scientific and analytical applications that require the solution of large sparse linear systems. Because the sparsity patterns of sparse matrices are unknown before runtime, we used machine learning to optimize the performance of the SpMV kernel, exploiting the structure of the sparse matrices through the Block Compressed Sparse Row (BCSR) storage format. As the structure of sparse matrices varies across application domains, optimizing the block size is important for reducing the overall execution time. Manual allocation of block sizes is error prone and time consuming. Thus, we propose AAQAL, a data-driven, machine learning-based tool that automates the process of data distribution and selection of near-optimal block sizes based on the structure of the matrix. We trained and tested the tool using different machine learning methods (decision tree, random forest, gradient boosting, ridge regressor, and AdaBoost) and nearly 700 real-world matrices from 43 application domains, including computer vision, robotics, and computational fluid dynamics. AAQAL achieved 93.47% of the maximum attainable performance, a substantial improvement over the manual or random selection of block sizes used in practice. This is the first attempt at exploiting matrix structure using BCSR to select optimal block sizes for SpMV computations using machine learning techniques.
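To make the role of the block size concrete, the following is a minimal pure-Python sketch of BCSR storage and the SpMV loop over it. It is not AAQAL's implementation: the helper names `to_bcsr`, `bcsr_spmv`, and `fill_ratio` are hypothetical, the 4x4 matrix is a toy example, and the fill ratio (stored entries divided by true nonzeros) is used here only as a simple, commonly used proxy for the padding cost of a block size, not as a feature taken from the paper.

```python
def to_bcsr(dense, r, c):
    """Convert a dense matrix (list of lists) to BCSR with r x c blocks.

    Returns (row_ptr, col_idx, blocks). Assumes the matrix dimensions
    are divisible by r and c (pad the matrix first otherwise).
    """
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % r == 0 and n_cols % c == 0, "pad the matrix first"
    row_ptr, col_idx, blocks = [0], [], []
    for bi in range(n_rows // r):
        for bj in range(n_cols // c):
            block = [dense[bi * r + i][bj * c + j]
                     for i in range(r) for j in range(c)]
            if any(v != 0 for v in block):  # keep only blocks with a nonzero
                col_idx.append(bj)
                blocks.append(block)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, blocks


def bcsr_spmv(row_ptr, col_idx, blocks, r, c, x):
    """Compute y = A @ x for a matrix A stored in BCSR."""
    y = [0.0] * ((len(row_ptr) - 1) * r)
    for bi in range(len(row_ptr) - 1):
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            bj, block = col_idx[k], blocks[k]
            for i in range(r):
                acc = 0.0
                for j in range(c):
                    acc += block[i * c + j] * x[bj * c + j]
                y[bi * r + i] += acc
    return y


def fill_ratio(dense, r, c):
    """Stored entries (including explicit zeros) divided by true nonzeros."""
    nnz = sum(1 for row in dense for v in row if v != 0)
    _, _, blocks = to_bcsr(dense, r, c)
    return len(blocks) * r * c / nnz


# Toy 4x4 matrix with dense 2x2 blocks on the diagonal (illustrative only).
A = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 6.0],
    [0.0, 0.0, 7.0, 8.0],
]
x = [1.0, 1.0, 1.0, 1.0]

for r, c in [(1, 1), (2, 2), (4, 4)]:
    row_ptr, col_idx, blocks = to_bcsr(A, r, c)
    y = bcsr_spmv(row_ptr, col_idx, blocks, r, c, x)
    print((r, c), "fill ratio:", fill_ratio(A, r, c), "y:", y)
```

In this toy case the 2x2 blocking matches the matrix structure, so it stores no padding zeros and needs fewer column indices than 1x1 blocks, whereas the 4x4 blocking stores twice as many values as there are nonzeros. A block-size selector such as the one described in the abstract aims to predict a good trade-off of this kind from the matrix structure, rather than trying every block size.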

Funder

King Abdulaziz University

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/12/14/7073/pdf

References: 43 articles.

1. Performance Optimization of SpMV on Spark;Xie;Proceedings of the 2019 IEEE International Conference on Big Data (Big Data),2019

2. Midgar: Detection of people through computer vision in the Internet of Things scenarios to improve the security in Smart Cities, Smart Towns, and Smart Homes

3. Cloud-Enhanced Robotic System for Smart City Crowd Control

4. 3D Design and Modeling of Smart Cities from a Computer Graphics Perspective

5. Crowd-sensing our Smart Cities: a Platform for Noise Monitoring and Acoustic Urban Planning

Cited by 4 articles.

1. Revisiting thread configuration of SpMV kernels on GPU: A machine learning based approach;Journal of Parallel and Distributed Computing;2024-03

2. Applying sustainable development goals in financial forecasting using machine learning techniques;Corporate Social Responsibility and Environmental Management;2023-12-12

3. Leveraging Memory Copy Overlap for Efficient Sparse Matrix-Vector Multiplication on GPUs;Electronics;2023-08-31

4. Vision-Based Semantic Segmentation in Scene Understanding for Autonomous Driving: Recent Achievements, Challenges, and Outlooks;IEEE Transactions on Intelligent Transportation Systems;2022-12