Affiliation:
1. University of Science and Technology of China, Hefei, China
Abstract
The ability to model and predict the execution time of GPU computations is crucial for real-time graphics application development and optimization. Although graphics programmers have many existing methods for producing such estimates, these methods are often vendor-dependent, require the platforms to be tested, or fail to capture the contextual influences among shader instructions. To address this challenge, we propose ShaderPerFormer, a platform-independent, context-aware deep-learning approach that models GPU performance and provides end-to-end performance predictions on a per-shader basis. To provide more accurate predictions, our method includes a separate stage that gathers platform-independent shader program trace information. We also provide a dataset of 54,667 fragment shader performance samples on five different platforms. Compared to the PILR and SH baseline methods, our approach reduces the average MAPE across the five platforms by 8.26% and 25.25%, respectively.
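The abstract reports accuracy as MAPE (mean absolute percentage error). As a minimal sketch of how that metric is commonly computed, assuming the standard definition (the abstract does not spell out the exact formula used), the snippet below compares measured and predicted shader times. The function name `mape` and the sample timings are hypothetical and are not taken from the paper or its dataset.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent (standard definition, assumed)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute prediction error relative to the measured value.
    total = sum(abs(p - m) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical per-shader timings in milliseconds (illustrative only).
measured_ms = [2.0, 4.0, 10.0]
predicted_ms = [2.5, 3.0, 10.0]
print(f"MAPE: {mape(measured_ms, predicted_ms):.2f}%")
```

A lower MAPE means the predicted times track the measured times more closely; averaging the per-platform values gives a cross-platform figure of the kind quoted above.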
Funder
National Key Research and Development Program of China
National Natural Science Foundation of China
Publisher
Association for Computing Machinery (ACM)