Architecture-Adaptive Code Variant Tuning

Published: 2016-07-29 Issue: 2 Volume: 44 Page: 325-338
ISSN: 0163-5964
Container-title: ACM SIGARCH Computer Architecture News
Language: en
Short-container-title: SIGARCH Comput. Archit. News

Author:

Muralidharan Saurav¹, Roy Amit¹, Hall Mary¹, Garland Michael², Rai Piyush³

Affiliation:

1. University of Utah, Salt Lake City, UT, USA

2. NVIDIA Corporation, Santa Clara, CA, USA

3. IIT Kanpur, Kanpur, India

Abstract

Code variants represent alternative implementations of a computation, and are common in high-performance libraries and applications to facilitate selecting the most appropriate implementation for a specific execution context (target architecture and input dataset). Automating code variant selection typically relies on machine learning to construct a model during an offline learning phase that can be quickly queried at runtime once the execution context is known. In this paper, we define a new approach called architecture-adaptive code variant tuning, where the variant selection model is learned on a set of source architectures, and then used to predict variants on a new target architecture without having to repeat the training process. We pose this as a multi-task learning problem, where each source architecture corresponds to a task; we use device features in the construction of the variant selection model. This work explores the effectiveness of multi-task learning and the impact of different strategies for device feature selection. We evaluate our approach on a set of benchmarks and a collection of six NVIDIA GPU architectures from three distinct generations. We achieve performance results that are mostly comparable to the previous approach of tuning for a single GPU architecture without having to repeat the learning phase.
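The workflow described in the abstract (learn on source architectures, then predict a variant on an unseen target architecture from input and device features) can be illustrated with a minimal sketch. This is not the paper's multi-task learning method: as a simpler stand-in, it pools the samples from all source devices, appends device features to each input's features, and uses a 1-nearest-neighbor lookup. All device names, features, variant labels, and numbers below are hypothetical.

```python
import math

# Hypothetical device features: (number of SMs, memory bandwidth in GB/s).
SOURCE_DEVICES = {
    "gpu_a": (8.0, 150.0),
    "gpu_b": (16.0, 300.0),
}

# Hypothetical offline training data collected on the source devices:
# (device name, input features, best-performing variant).
# The single input feature is an assumed "average nonzeros per row".
TRAINING = [
    ("gpu_a", (2.0,), "scalar_kernel"),
    ("gpu_a", (12.0,), "scalar_kernel"),
    ("gpu_a", (40.0,), "vector_kernel"),
    ("gpu_b", (2.0,), "scalar_kernel"),
    ("gpu_b", (12.0,), "vector_kernel"),
    ("gpu_b", (40.0,), "vector_kernel"),
]


def build_rows(training, devices):
    """Concatenate input features with the features of the device they ran on."""
    return [(tuple(inp) + tuple(devices[dev]), label)
            for dev, inp, label in training]


def column_scales(rows):
    """Per-column scale (largest absolute value) so no feature dominates."""
    width = len(rows[0][0])
    return [max(abs(vec[i]) for vec, _ in rows) or 1.0 for i in range(width)]


def predict_variant(rows, scales, input_features, device_features):
    """Return the variant of the nearest pooled training sample."""
    query = tuple(input_features) + tuple(device_features)

    def distance(vec):
        return math.sqrt(sum(((a - b) / s) ** 2
                             for a, b, s in zip(vec, query, scales)))

    return min(rows, key=lambda row: distance(row[0]))[1]


# Usage: predict for a target device that was never trained on.
rows = build_rows(TRAINING, SOURCE_DEVICES)
scales = column_scales(rows)
target_device = (14.0, 280.0)  # hypothetical unseen GPU
print(predict_variant(rows, scales, (12.0,), target_device))
```

Because the device features are part of the query, the same input can map to different variants on different target devices without any retraining on the target.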

Funder

Defense Advanced Research Projects Agency

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/2980024.2872411

References: 38 articles.

1. PetaBricks

2. S. Baxter. Modern GPU library. http://nvlabs.github.io/moderngpu/.

3. Implementing sparse matrix-vector multiplication on throughput-oriented processors

4. Towards Low-Cost, High-Accuracy Classifiers for Linear Solver Selection

Cited by 6 articles.

1. Using hardware performance counters to speed up autotuning convergence on GPUs;Journal of Parallel and Distributed Computing;2022-02

2. A Survey of Performance Tuning Techniques and Tools for Parallel Applications;IEEE Access;2022

3. The Behavioral Diversity of Java JSON Libraries;2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE);2021-10

4. Exploiting historical data: Pruning autotuning spaces and estimating the number of tuning steps;Concurrency and Computation: Practice and Experience;2020-08-10

5. A benchmark set of highly-efficient CUDA and OpenCL kernels and its dynamic autotuning with Kernel Tuning Toolkit;Future Generation Computer Systems;2020-07