Affiliation:
1. Indian Institute of Technology Delhi, India
Abstract
The unprecedented growth of edge computing and 5G has led to an increased offloading of mobile applications to cloud servers or edge cloudlets.
Among these workloads, computer vision applications are the most prominent. Conventional wisdom holds that computer vision workloads perform very well on SIMD/SIMT architectures such as GPUs owing to the dominance of linear algebra kernels in their composition. In this work, we debunk this popular belief through extensive experiments with the concurrent execution of these workloads, which is the most common pattern in which they are run on cloud servers. We show that the performance of these applications on GPUs does not scale well with an increasing number of concurrent applications, primarily because of contention for shared resources and the lack of efficient virtualization techniques for GPUs. Hence, there is a need to accurately predict the performance and power of such ensemble workloads on a GPU. Unfortunately, most prior work on performance/power prediction targets only a single application. To the best of our knowledge, we propose the first machine learning-based predictor for the performance and power of an ensemble of applications on a GPU. In this article, we show that by using the execution statistics of stand-alone workloads and the fairness of execution when these workloads are co-executed with three representative microbenchmarks, we can obtain reasonably accurate predictions. This is the first work on performance and power prediction for concurrent applications that does not rely on features extracted from concurrent executions or GPU profiling data. Our predictors achieve accuracies of 91% and 96%, respectively, in estimating the performance and power of two concurrently executing applications. We also demonstrate a method to extend our models to four or five concurrently running applications on modern GPUs.
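As a rough illustration of the prediction approach described in the abstract, the sketch below trains a regressor on the stand-alone execution statistics of two applications together with their fairness scores against three representative microbenchmarks, and predicts a metric for the concurrent run. The feature layout, model choice (a random forest), and the synthetic placeholder data are assumptions made for illustration only; they are not taken from the paper.

# Hypothetical sketch: predict the combined performance of two co-running GPU
# applications from their stand-alone statistics plus their fairness scores
# measured against three microbenchmarks (e.g., compute-, memory-, and
# cache-intensive). All names and values are illustrative placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def build_pair_features(standalone_a, standalone_b, fairness_a, fairness_b):
    """Concatenate per-application stand-alone counters with the fairness
    scores obtained when co-running each application with the three
    microbenchmarks."""
    return np.concatenate([standalone_a, standalone_b, fairness_a, fairness_b])

# Illustrative synthetic training set (placeholder values only).
rng = np.random.default_rng(0)
n_pairs, n_counters, n_micro = 64, 8, 3
X = np.stack([
    build_pair_features(rng.random(n_counters), rng.random(n_counters),
                        rng.random(n_micro), rng.random(n_micro))
    for _ in range(n_pairs)
])
y = rng.random(n_pairs)  # normalized ensemble throughput (placeholder target)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Predict for a new, unseen application pair.
x_new = build_pair_features(rng.random(n_counters), rng.random(n_counters),
                            rng.random(n_micro), rng.random(n_micro))
print(model.predict(x_new.reshape(1, -1)))

The same pair-wise scheme generalizes to larger ensembles by concatenating the stand-alone and fairness features of each additional application, which is the spirit of the extension to four or five concurrent applications mentioned in the abstract.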
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
6 articles.
1. PredATW: Predicting the Asynchronous Time Warp Latency For VR Systems. ACM Transactions on Embedded Computing Systems, 2024-08-14.
2. Analyzing GPU Energy Consumption in Data Movement and Storage. 2024 IEEE 35th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2024-07-24.
3. Agnostic Energy Consumption Models for Heterogeneous GPUs in Cloud Computing. Applied Sciences, 2024-03-12.
4. Program Analysis and Machine Learning-based Approach to Predict Power Consumption of CUDA Kernel. ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2023-07-24.
5. AMPeD: An Analytical Model for Performance in Distributed Training of Transformers. 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2023-04.