Advanced computer architecture optimization for machine learning/deep learning

Authors:

Meda Shefqet, Domazet Ervin

Abstract

Recent progress in Machine Learning (Géron, 2022), and particularly in Deep Learning (Goodfellow, 2016), has exposed the limitations of traditional computer architectures. Modern algorithms have sharply increased computational and data requirements that most existing architectures cannot handle efficiently. These demands create bottlenecks in training speed, inference latency, and power consumption, which is why advanced methods of computer architecture optimization are required to enable efficient hardware platforms dedicated to ML/DL (Engineers, 2019). Optimizing computer architecture for ML/DL applications has become critical because of the heavy demand for efficient execution of the complex computations performed by neural networks (Goodfellow, 2016). This paper reviews the approaches and methods used to optimize computer architecture for ML/DL workloads. It discusses hardware-level optimizations, enhancements to traditional software frameworks and their specialized variants, and innovative architecture explorations. In particular, it covers hardware, including specialized accelerators, that can improve the performance and efficiency of a computing system, describing processors such as multicore CPUs (Hennessy, 2017), GPUs (Hwu, 2015), and TPUs (Contributors, 2017); parallelism in multicore architectures; data movement in hardware systems, especially techniques such as caching; sparsity, compression, and quantization; and other special techniques and configurations, such as specialized data formats and measurement sparsity.
Moreover, this paper provides a comprehensive analysis of current trends in software frameworks, data movement optimization strategies (A.Bienz, 2021), sparsity, quantization, and compression methods, the use of ML for architecture exploration, runtime systems, and dynamic voltage and frequency scaling (DVFS) (Hennessy, 2017), which offers strategies for maximizing hardware utilization and managing power consumption during training. Finally, the paper discusses directions for future research and the potential influence of computer architecture optimization on industrial and academic areas of ML/DL technology. The objective of these optimization techniques is to substantially narrow the gap between the computational needs of ML/DL algorithms and the capability of current hardware. Doing so should lead to significant improvements in training times, enable real-time inference for various applications, and ultimately unlock the full potential of cutting-edge machine learning algorithms.

Publisher

Canadian Institute of Technology

References (26 articles)

1. A.Bienz, L. N. (2021). Modeling Data Movement Performance on Heterogeneous Architectures. IEEE High Performance Extreme Computing Conference (HPEC) (pp. 1-7). Waltham, MA, USA: Institute of Electrical and Electronics Engineers Inc.

2. Abadi, M. B. (2016). TensorFlow: A System for Large-Scale Machine Learning. 12th USENIX Symposium on Operating Systems Design and Implementation, 265–283.

3. apache.org. (2024). Apache MXNet: A Flexible and Efficient Library for Deep Learning. Retrieved from https://mxnet.apache.org/versions/1.9.1/

4. Brandon Reagen, R. A.-Y. (2017). Deep Learning for Computer Architects. In P. U. Margaret Martonosi, Synthesis Lectures on Computer Architecture. Springer Nature Switzerland.

5. Contributors. (2017, June 26). In-Datacenter Performance Analysis of a Tensor Processing Unit. Retrieved from https://arxiv.org/pdf/1704.04760
