GRAM

Published:2021-03 Issue:2 Volume:18 Page:1-24
ISSN:1544-3566
Container-title:ACM Transactions on Architecture and Code Optimization
language:en
Short-container-title:ACM Trans. Archit. Code Optim.

Author:

Ho Nhut-Minh¹, Silva Himeshi De¹, Wong Weng-Fai¹

Affiliation:

1. National University of Singapore

Abstract

This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained tradeoff between output error and performance. It can create many variants that satisfy different accuracy requirements by adaptively assigning different groups of threads to different precision levels at runtime. To widen the range of applications that can benefit from its approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, we can trade off precision for performance improvements of up to 540%, depending on the application and accuracy requirement.
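
The idea of giving different groups of threads different precision levels can be illustrated with a small CPU-side emulation. The sketch below is not GRAM's API or its selection algorithm: the function names (`run_kernel`, `pick_assignment`), the toy computation `x*x + 1`, and the greedy search are all assumptions made for illustration. Half and single precision are emulated by rounding through Python's `struct` module.

```python
import struct

def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical precision levels; Python floats are already double precision.
ROUNDERS = {"half": to_half, "single": to_single, "double": float}

def run_kernel(data, group_size, precisions):
    """Emulate a kernel in which thread `tid` belongs to group
    `tid // group_size` and computes x*x + 1 at that group's precision."""
    out = []
    for tid, x in enumerate(data):
        rnd = ROUNDERS[precisions[tid // group_size]]
        x = rnd(x)                          # round the input
        out.append(rnd(rnd(x * x) + 1.0))   # round after each operation
    return out

def max_rel_error(approx, exact):
    """Largest relative error of approx against the exact results."""
    return max(abs(a - e) / abs(e) for a, e in zip(approx, exact))

def pick_assignment(data, group_size, n_groups, max_error):
    """Greedy illustration (an assumption, not the paper's method):
    use half precision in as many leading groups as the error budget allows."""
    exact = run_kernel(data, group_size, ["double"] * n_groups)
    for n_half in range(n_groups, -1, -1):
        precisions = ["half"] * n_half + ["double"] * (n_groups - n_half)
        if max_rel_error(run_kernel(data, group_size, precisions), exact) <= max_error:
            return precisions

data = [0.1 * (i + 1) for i in range(8)]    # 8 emulated threads, 2 groups of 4
exact = run_kernel(data, 4, ["double", "double"])
mixed = run_kernel(data, 4, ["half", "double"])
all_half = run_kernel(data, 4, ["half", "half"])
```

Here the mixed variant reproduces the double-precision results for the second group of threads while the first group runs at half precision, so its worst-case error cannot exceed that of the all-half variant. A tighter error budget passed to `pick_assignment` pushes more groups back to double precision.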

Funder

Singapore Ministry of Education

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3441830

References: 51 articles.

1. Approximating sine functions using variable-precision Taylor polynomials

2. Rodinia: A benchmark suite for heterogeneous computing

3. Dynamic Precision Autotuning with TAFFO

4. Rigorous floating-point mixed-precision tuning

5. ApproxSymate: path sensitive program approximation using symbolic execution

Cited by 9 articles.

1. MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores;Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems;2024-06-20

2. Interleaved Execution of Approximated CUDA Kernels in Iterative Applications;2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP);2024-03-20

3. Predicting Performance and Accuracy of Mixed-Precision Programs for Precision Tuning;Proceedings of the IEEE/ACM 46th International Conference on Software Engineering;2024-02-06

4. Towards a SYCL API for Approximate Computing;International Workshop on OpenCL;2023-04-18

5. FPChecker: Floating-Point Exception Detection Tool and Benchmark for Parallel and Distributed HPC;2022 IEEE International Symposium on Workload Characterization (IISWC);2022-11