Deep Learning-Driven Compiler Enhancements for Efficient Matrix Multiplication

Published:2024-07-01 Issue:2 Volume:3 Page:08-18
ISSN:3009-075X
Container-title:Journal of Computers, Mechanical and Management
Short-container-title:J. Comput. Mech. Manag

Author:

Kumar Raunak, Negi Karma Chhering, Sharma Nitish Kumar, Gupta Priya

Abstract

Matrix multiplication is a fundamental operation in many computational fields, and it must be optimized to handle growing data sizes efficiently. This paper reviews the use of deep learning to optimize matrix multiplication, a topic of increasing importance given the growing cost of matrix multiplication in gaming and other complex programs. It first describes the standard (naive) matrix multiplication and the time it takes for different matrix sizes. It then describes tiled matrix multiplication, which partitions the matrices into tiles, computes the product for each tile, and combines the partial results. The times taken by the two methods are compared across matrix sizes. The main idea is to use Deep Neural Networks (DNNs) to compare and rank the code variants obtained from tiling and determine their relative performance. A tournament-based ranking system assigns ranks to the code versions. The techniques are evaluated on matrix multiplication operations commonly found in deep learning workloads. The approach achieves up to an 8.844x speedup over the naive implementation for a matrix size of 1024. The results demonstrate the effectiveness of combining compiler optimization techniques with deep learning models to optimize matrix multiplication.
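The difference between the naive and tiled methods described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `matmul_naive` and `matmul_tiled`, the `tile` parameter, and its default of 32 are assumptions chosen for illustration. Pure Python is used only to show the loop structure; it will not reproduce the reported speedups, which depend on compiled code and cache behavior.

```python
def matmul_naive(a, b):
    """Standard triple-loop product of an n x m and an m x p matrix."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            acc = 0.0
            for k in range(m):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c


def matmul_tiled(a, b, tile=32):
    """Tiled product: iterate over tile-sized blocks, then within each block.

    The tile size is a hypothetical tuning parameter; in a compiled
    language it would be chosen so that the working blocks fit in cache.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over the top-left corner of each block.
    for i0 in range(0, n, tile):
        for k0 in range(0, m, tile):
            for j0 in range(0, p, tile):
                # Inner loops accumulate one block's partial product into c.
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

Both functions compute the same product; the tiled version only reorders the work so that each block of the operands is reused while it is still resident in fast memory. Each distinct tile size (or loop order) yields a separate code variant, which is the kind of candidate a ranking model could compare.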

Publisher

Global Academic Digital Library
