Affiliation:
1. Intel Corporation, Santa Clara, CA
Abstract
Cache miss stalls hurt performance because of the large gap between memory and processor speeds - for example, the popular server benchmark SPEC JBB2000 spends 45% of its cycles stalled waiting for memory requests on the Itanium® 2 processor. Traversing linked data structures causes a large portion of these stalls. Prefetching for linked data structures remains a major challenge because serial data dependencies between elements in a linked data structure preclude the timely materialization of prefetch addresses. This paper presents Mississippi Delta (MS Delta), a novel technique for prefetching linked data structures that closely integrates the hardware performance monitor (HPM), the garbage collector's global view of heap and object layout, the type-level metadata inherent in type-safe programs, and JIT compiler analysis. The garbage collector uses the HPM's data cache miss information to identify cache miss intensive traversal paths through linked data structures, and then discovers regular distances (deltas) between these linked objects. JIT compiler analysis injects prefetch instructions using deltas to materialize prefetch addresses. We have implemented MS Delta in a fully dynamic profile-guided optimization system: the StarJIT dynamic compiler [1] and the ORP Java virtual machine [9]. We demonstrate a 28-29% reduction in stall cycles attributable to the high-latency cache misses targeted by MS Delta and a speedup of 11-14% on the cache miss intensive SPEC JBB2000 benchmark.
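The delta idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm, which the abstract does not spell out. It only shows the general notion of finding a regular address distance between linked objects and using it to compute a prefetch address without following the pointer chain. The function names, the `(base_addr, target_addr)` sample format, and the `min_share` threshold are all assumptions made for illustration.

```python
from collections import Counter


def dominant_delta(samples, min_share=0.5):
    """Return the most frequent address delta, or None if none dominates.

    samples: list of (base_addr, target_addr) pairs, a hypothetical format
    standing in for objects observed along a miss-intensive traversal path.
    min_share: assumed threshold; the delta must cover at least this
    fraction of the samples to be considered regular.
    """
    if not samples:
        return None
    deltas = Counter(target - base for base, target in samples)
    delta, count = deltas.most_common(1)[0]
    return delta if count / len(samples) >= min_share else None


def prefetch_address(base_addr, delta):
    """Materialize a prefetch address directly from the base object address."""
    return base_addr + delta


# Three of the four sampled pairs are 0x40 bytes apart.
samples = [
    (0x1000, 0x1040),
    (0x2000, 0x2040),
    (0x3000, 0x3040),
    (0x4000, 0x4100),
]
delta = dominant_delta(samples)
if delta is not None:
    print(hex(prefetch_address(0x5000, delta)))
```

Because the prefetch address is derived from the base object's address plus a constant, it is available before the intervening pointer loads complete, which is the property the abstract describes as sidestepping serial data dependencies.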
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
16 articles.
1. MaPHeA: A Framework for Lightweight Memory Hierarchy-aware Profile-guided Heap Allocation;ACM Transactions on Embedded Computing Systems;2022-12-13
2. OJXPerf;Proceedings of the 44th International Conference on Software Engineering;2022-05-21
3. MaPHeA: a lightweight memory hierarchy-aware profile-guided heap allocation framework;Proceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems;2021-06-22
4. MxTasks: How to Make Efficient Synchronization and Prefetching Easy;Proceedings of the 2021 International Conference on Management of Data;2021-06-09
5. Improving program locality in the GC using hotness;Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation;2020-06-06