Effective Loop Fusion in Polyhedral Compilation Using Fusion Conflict Graphs

Author:

Acharya Aravind (1), Bondhugula Uday (1), Cohen Albert (2)

Affiliation:

1. Indian Institute of Science, India

2. Google, France

Abstract

Polyhedral auto-transformation frameworks are known to find efficient loop transformations that maximize locality and parallelism and minimize synchronization. While complex loop transformations are routinely modeled in these frameworks, they tend to rely on ad hoc heuristics for loop fusion. Although there exist multiple loop fusion models with cost functions to maximize locality and parallelism, these models involve separate optimization steps rather than seamlessly integrating with other loop transformations like loop permutation, scaling, and shifting. Incorporating parallelism-preserving loop fusion heuristics into existing affine transformation frameworks like Pluto, LLVM-Polly, PPCG, and PoCC requires solving a large number of Integer Linear Programming formulations, which increase auto-transformation times significantly. In this work, we incorporate polynomial-time loop fusion heuristics into the Pluto-lp-dfp framework. We present a data structure called the fusion conflict graph (FCG), which enables us to efficiently model loop fusion in the presence of other affine loop transformations. We propose a clustering heuristic to group the vertices of the FCG, which further enables us to provide three different polynomial-time greedy fusion heuristics, namely, maximal fusion, typed fusion, and hybrid fusion, while maintaining the compile-time improvements of Pluto-lp-dfp over Pluto. Our experiments reveal that the hybrid fusion model, in conjunction with Pluto’s cost function, finds efficient transformations that outperform PoCC and Pluto by mean factors of 1.8× and 1.07×, respectively, with a maximum performance improvement of 14× over PoCC and 2.6× over Pluto.

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Information Systems, Software

Cited by 7 articles.

1. Register Blocking: An Analytical Modelling Approach for Affine Loop Kernels;Proceedings of the 21st ACM International Conference on Computing Frontiers;2024-05-07

2. PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler;2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO);2024-03-02

3. Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers Using Affine Relations;ACM Transactions on Computer Systems;2023-11-30

4. Arbitrarily Parallelizable Code: A Model of Computation Evaluated on a Message-Passing Many-Core System;Computers;2022-11-18

5. Deep reinforcement learning in loop fusion problem;Neurocomputing;2022-04
