Affiliation:
1. The Pennsylvania State University, University Park, PA, USA
Abstract
Caches play a critical role in today's computer systems, and optimizing their performance has been a major objective over the last couple of decades. Unfortunately, compared to the plethora of work on software- and hardware-directed code/data optimizations, much less effort has been spent on understanding the fundamental characteristics of the data access patterns exhibited by application programs and their interaction with the underlying cache hardware. Consequently, it is generally hard to reason about the cache behavior of a program running on a target system. Motivated by this observation, we first set up a "locality model" that can help us determine the theoretical bounds on the cache misses caused by irregular data accesses. We then explain how this locality model can be used for different data locality optimization purposes. After that, based on our model, we propose a data reordering (data layout reorganization) scheme that can be applied after any existing data reordering scheme for irregular applications to improve cache performance by further reducing cache misses. We evaluate the effectiveness of our scheme using a set of 8 programs with irregular data accesses, and show that it brings significant improvements over the state of the art on two commercial multicore machines.
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Software
Reference
36 articles.
1. Intel Pentium 4 and Intel Xeon processor optimization reference manual. http://developer.intel.com.
2. Intel XScale core developer's manual. http://developer.intel.com.
3. SUIF2. http://suif.stanford.edu.
4. Data and computation transformations for multiprocessors
5. A Partitioning Strategy for Nonuniform Problems on Multiprocessors
Cited by
3 articles.
1. Efficient approximations for cache-conscious data placement;Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation;2022-06-09
2. Huron: hybrid false sharing detection and repair;Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation;2019-06-08
3. Efficient parameterized algorithms for data packing;Proceedings of the ACM on Programming Languages;2019-01-02