Affiliation:
1. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences
2. Computational Science and Engineering, Georgia Institute of Technology
3. Institute of Applied Physics and Computational Mathematics
Abstract
Automatic performance tuning (autotuning) is an increasingly important technique for achieving portable high performance in exascale applications. However, constructing an autotuner from scratch remains challenging, even for domain experts. In this work, we propose PAK, a performance tuning and knowledge management suite that helps users rapidly build autotuners. To accommodate existing autotuning techniques, we present an autotuning protocol composed of five components: an extractor, a producer, an optimizer, an evaluator, and a learner. To achieve modularity and reusability, we define programming interfaces for each protocol component as the fundamental infrastructure, providing a customizable mechanism for deploying knowledge mining on the performance database. PAK's usability is demonstrated on two important computational kernels: stencil computation and sparse matrix-vector multiplication (SpMV). Autotuners built on PAK achieve performance comparable to traditional autotuners with higher productivity, requiring only a few tens of lines of code written against our autotuning protocol.
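The abstract names the five protocol components but does not spell out their interfaces. The following is a minimal sketch of how such a protocol might be expressed; all class names, method signatures, and the driver loop are illustrative assumptions, not PAK's actual API.

```python
from abc import ABC, abstractmethod

class Extractor(ABC):
    """Extracts the tunable parameter space from a kernel (hypothetical interface)."""
    @abstractmethod
    def extract(self, kernel): ...

class Producer(ABC):
    """Produces a concrete code variant for one parameter configuration."""
    @abstractmethod
    def produce(self, config): ...

class Optimizer(ABC):
    """Proposes the next configuration to try, given the space and search history."""
    @abstractmethod
    def next_config(self, space, history): ...

class Evaluator(ABC):
    """Compiles and runs a variant, returning a performance measurement."""
    @abstractmethod
    def evaluate(self, variant): ...

class Learner(ABC):
    """Mines the performance database to guide or prune future searches."""
    @abstractmethod
    def learn(self, database): ...

def autotune(extractor, producer, optimizer, evaluator, learner,
             kernel, database, budget):
    """Illustrative driver loop wiring the five components together."""
    space = extractor.extract(kernel)
    history = []
    for _ in range(budget):
        config = optimizer.next_config(space, history)
        variant = producer.produce(config)
        perf = evaluator.evaluate(variant)
        history.append((config, perf))
        database.append((config, perf))   # persist results for knowledge mining
    learner.learn(database)               # feed accumulated data back into the search
    return max(history, key=lambda cp: cp[1])
```

Under this reading, a user builds an autotuner by subclassing only the components that differ for their kernel (e.g., a stencil-specific Producer), which is consistent with the abstract's claim of modularity and a few tens of lines of user code.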
Funder
National Natural Science Foundation of China
National Key Research and Development Program of China
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modeling and Simulation, Software
Cited by
2 articles.
1. CodeSeer;Proceedings of the 34th ACM International Conference on Supercomputing;2020-06-29
2. Register-Aware Optimizations for Parallel Sparse Matrix–Matrix Multiplication;International Journal of Parallel Programming;2019-01-01