Affiliation:
1. MPRC, School of Computer Science, Peking University, Beijing, P.R. China
Abstract
Page-based virtual memory relies on TLBs to accelerate the address translation. Nowadays, the gap between application workloads and the capacity of TLB continues to grow, bringing many costly TLB misses and making the TLB a performance bottleneck. Previous studies seek to narrow the gap by exploiting the contiguity of physical pages. One promising solution is to group pages that are both virtually and physically contiguous into a memory range. Recording range translations can greatly increase the TLB reach, but ranges are also hard to index because they have arbitrary bounds. The processor has to compare against all the boundaries to determine which range an address falls in, which restricts the usage of memory ranges.In this article, we propose a tagged-pointer-based scheme, FlexPointer, to solve the range indexing problem. The core insight of FlexPointer is that large memory objects are rare, so we can create memory ranges based on such objects and assign each of them a unique ID. With the range ID integrated into pointers, we can index the range TLB with IDs and greatly simplify its structure. Moreover, because the ID is stored in the unused bits of a pointer and is not manipulated by the address generation, we can shift the range lookup to an earlier stage, working in parallel with the address generation. According to our trace-based simulation results, FlexPointer can reduce nearly all the L1 TLB misses, and page walks for a variety of memory-intensive workloads. Compared with a 4K-page baseline system, FlexPointer shows a 14% performance improvement on average and up to 2.8x speedup in the best case. For other workloads, FlexPointer shows no performance degradation.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture,Information Systems,Software
Reference62 articles.
1. Mohammad Agbarya, Idan Yaniv, Jayneel Gandhi, and Dan Tsafrir. 2020. Predicting execution times with partial simulations in virtual memory research: Why and how. In Proceedings of the MICRO’20. 456–470.
2. Do-It-Yourself Virtual Memory Translation
3. Chloe Alverti, Stratos Psomadakis, Vasileios Karakostas, Jayneel Gandhi, Konstantinos Nikas, Georgios Goumas, and Nectarios Koziris. 2020. Enhancing and exploiting contiguity for fast memory virtualization. In Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA’20). IEEE Press, 515–528.
4. Rachata Ausavarungnirun, Timothy Merrifield, Jayneel Gandhi, and Christopher J. Rossbach. 2020. PRISM: Architectural support for variable-granularity memory metadata. In Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques (PACT’20). Association for Computing Machinery, New York, NY, 441–454.
5. Thomas W. Barr, Alan L. Cox, and Scott Rixner. 2010. Translation caching: Skip, don’t walk (the page table). In Proceedings of the 37th Annual International Symposium on Computer Architecture (ISCA’10). Association for Computing Machinery, New York, NY, 48–59.