Affiliation:
1. National University of Singapore
Abstract
This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained tradeoff between output error and performance. It can create many variants that satisfy different accuracy requirements by adaptively assigning different groups of threads to different precision levels at runtime. To widen the range of applications that can benefit from its approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, we can trade precision for performance improvements of up to 540%, depending on the application and accuracy requirement.
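The idea of giving different thread groups different precision levels can be illustrated with a small CPU-side sketch. It is not GRAM's implementation: it emulates reduced precision by rounding values through Python's `struct` formats (`e` for half, `f` for single, `d` for double), and the names `GROUP_SIZE`, `axpy_mixed`, and `choose_assignment`, as well as the "first candidate that meets the error bound" policy, are assumptions made purely for illustration.

```python
import struct

GROUP_SIZE = 32  # hypothetical thread-group size, chosen to mirror a CUDA warp


def round_to(value, fmt):
    """Round a float to a struct format: 'e' half, 'f' single, 'd' double."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def axpy_mixed(a, xs, ys, group_formats):
    """Compute a*x + y per emulated thread, with precision chosen per group.

    Thread `tid` belongs to group `tid // GROUP_SIZE`; groups cycle through
    `group_formats`, so ["e", "d"] alternates half and double groups.
    """
    out = []
    for tid, (x, y) in enumerate(zip(xs, ys)):
        fmt = group_formats[(tid // GROUP_SIZE) % len(group_formats)]
        # Round the inputs and every intermediate result to the group's format.
        prod = round_to(round_to(a, fmt) * round_to(x, fmt), fmt)
        out.append(round_to(prod + round_to(y, fmt), fmt))
    return out


def max_abs_error(approx, exact):
    """Largest absolute deviation from the double-precision reference."""
    return max(abs(p - q) for p, q in zip(approx, exact))


def choose_assignment(a, xs, ys, candidates, error_bound):
    """Return the first candidate (ordered cheapest first) within the bound.

    Illustrative selection policy only; returns None if no candidate fits.
    """
    exact = [a * x + y for x, y in zip(xs, ys)]
    for formats in candidates:
        if max_abs_error(axpy_mixed(a, xs, ys, formats), exact) <= error_bound:
            return formats
    return None
```

Calling `choose_assignment` with candidates such as `[["e"], ["e", "f"], ["f"], ["d"]]` and a loose error bound picks the all-half variant, while a tight bound falls back to double, which mirrors the accuracy/performance tradeoff the abstract describes.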
Funder
Singapore Ministry of Education
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
9 articles.
1. MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores;Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems;2024-06-20
2. Interleaved Execution of Approximated CUDA Kernels in Iterative Applications;2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP);2024-03-20
3. Predicting Performance and Accuracy of Mixed-Precision Programs for Precision Tuning;Proceedings of the IEEE/ACM 46th International Conference on Software Engineering;2024-02-06
4. Towards a SYCL API for Approximate Computing;International Workshop on OpenCL;2023-04-18
5. FPChecker: Floating-Point Exception Detection Tool and Benchmark for Parallel and Distributed HPC;2022 IEEE International Symposium on Workload Characterization (IISWC);2022-11