Tile size selection revisited

Author:

Mehta Sanyam1,Beeraka Gautham1,Yew Pen-Chung1

Affiliation:

1. University of Minnesota

Abstract

Loop tiling is a widely used loop transformation to enhance data locality and allow data reuse. In the tiled code, however, tiles of different sizes can lead to significant variation in performance. Thus, selection of an optimal tile size is critical to performance of tiled codes. In the past, tile size selection has been attempted using both static analytical and dynamic empirical (auto-tuning) models. Past work using static models assumed a direct-mapped cache for the purpose of analysis and thus proved to be less robust. On the other hand, the auto-tuning models involve an exhaustive search in a large space of tiled codes. In this article, we propose a new analytical model for tile size selection that leverages the high set associativity in modern caches to minimize conflict misses. Our tile size selection model targets data reuse in multiple levels of cache. In addition, it considers the interaction of tiling with the SIMD unit in modern processors in estimating the optimal tile size. We find that these factors, not considered in previous models, are critical in developing a robust model for tile size selection. We implement our tile size selection model in a polyhedral compiler and test it on 12 benchmark kernels using two different problem sizes. Our model outperforms the previous analytical models that are based on reusing data in a single level of cache and achieves an average performance improvement of 9.7% and 20.4%, respectively, over the best square (cubic) tiles for the two problem sizes. In addition, the tile size chosen by our tile size selection algorithm is similar to the best performing size obtained through an extensive search, validating the analytical model underlying the algorithm.

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Information Systems,Software

Cited by 16 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers Using Affine Relations;ACM Transactions on Computer Systems;2023-11-30

2. A Methodology for Efficient Tile Size Selection for Affine Loop Kernels;International Journal of Parallel Programming;2022-05-23

3. Performance Analysis and Optimization with Little’s Law;2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS);2022-05

4. An Analytical Model for Loop Tiling Transformation;Lecture Notes in Computer Science;2022

5. IOOpt: automatic derivation of I/O complexity bounds for affine programs;Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation;2021-06-18

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3