Affiliation:
1. Princeton Univ., Princeton, NJ
Abstract
With the ever-widening performance gap between processors and main memory, the cache memory used to bridge this gap is becoming more and more significant. Caches work well for programs that exhibit sufficient locality. Other programs, however, have reference patterns that fail to exploit the cache and thereby suffer heavily from high memory latency. Achieving high cache efficiency and good program performance therefore requires efficient memory access behavior. In fact, for many programs, program transformations or source-code changes can radically alter memory access patterns, significantly improving cache performance. Both hand-tuning and compiler optimization techniques are often used to transform code to improve cache utilization. Unfortunately, cache conflicts are difficult to predict and estimate, which limits the effectiveness of such transformations; applying them well requires detailed knowledge about the frequency and causes of cache misses in the code. This article describes methods for generating and solving Cache Miss Equations (CMEs), which give a detailed representation of cache behavior, including conflict misses, in loop-oriented scientific code. Implemented within the SUIF compiler framework, our approach extends traditional compiler reuse analysis to generate linear Diophantine equations that summarize each loop's memory behavior. While solving these equations is in general difficult, we show that it is also unnecessary: mathematical techniques for manipulating Diophantine equations allow us to compute and/or reduce the number of possible solutions relatively easily, where each solution corresponds to a potential cache miss. The mathematical precision of CMEs allows us to find optimal solutions for transformations such as blocking or padding, and their generality allows us to reason about interactions between transformations applied in concert. The article also gives examples of their use to determine array padding and offset amounts that minimize cache misses, and to determine optimal blocking factors for tiled code. Overall, these equations represent an analysis framework that offers the generality and precision needed for detailed compiler optimizations.
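To make the kind of condition a CME captures concrete, the following is a minimal illustrative sketch, not the article's implementation: for two hypothetical references A[i] and B[i] in a single loop, it counts the iterations in which both elements map to the same set of a direct-mapped cache. The cache parameters, base addresses, and the count_conflicting_iterations helper are assumptions chosen purely for illustration.

    CACHE_SIZE = 1024   # cache capacity in array elements (assumed for illustration)
    LINE_SIZE = 8       # cache line size in array elements (assumed for illustration)
    NUM_SETS = CACHE_SIZE // LINE_SIZE

    def cache_set(addr):
        # Set that element address `addr` maps to in a direct-mapped cache.
        return (addr // LINE_SIZE) % NUM_SETS

    def count_conflicting_iterations(base_a, base_b, trip_count=256):
        # A conflict between A[i] and B[i] is possible when both map to the
        # same set, i.e. (base_a + i) - (base_b + i) = n * CACHE_SIZE + d with
        # |d| < LINE_SIZE for some integer n -- the kind of linear Diophantine
        # condition that a cache miss equation expresses symbolically rather
        # than by enumerating iterations as this sketch does.
        return sum(1 for i in range(trip_count)
                   if cache_set(base_a + i) == cache_set(base_b + i))

    # Base addresses exactly one cache size apart: every iteration can conflict.
    print(count_conflicting_iterations(0, 1024))              # prints 256
    # Padding B's base by one cache line eliminates the conflicts.
    print(count_conflicting_iterations(0, 1024 + LINE_SIZE))  # prints 0

The second call illustrates, in miniature, the padding question the article addresses: because the conflict condition is linear in the iteration variables and the pad amount, a pad that leaves the equation without solutions can be reasoned about analytically rather than found by exhaustive simulation.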
Publisher
Association for Computing Machinery (ACM)
Cited by
128 articles.
1. Leveraging LLVM's ScalarEvolution for Symbolic Data Cache Analysis;2023 IEEE Real-Time Systems Symposium (RTSS);2023-12-05
2. BullsEye: Scalable and Accurate Approximation Framework for Cache Miss Calculation;ACM Transactions on Architecture and Code Optimization;2022-11-17
3. Warping cache simulation of polyhedral programs;Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation;2022-06-09
4. CARL: Compiler Assigned Reference Leasing;ACM Transactions on Architecture and Code Optimization;2022-03-17
5. Intelligent Resource Provisioning for Scientific Workflows and HPC;2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS);2021-11