Rapid: Region-Based Pointer Disambiguation

Author:

Chitre Khushboo (1), Kedia Piyus (1), Purandare Rahul (2)

Affiliation:

1. IIIT Delhi, Delhi, India

2. University of Nebraska-Lincoln, Lincoln, USA

Abstract

Interprocedural alias analyses often sacrifice precision for scalability. Thus, modern compilers such as GCC and LLVM implement more scalable but less precise intraprocedural alias analyses. This compromise makes the compilers miss out on potential optimization opportunities, affecting the performance of the application. Modern compilers implement loop-versioning with dynamic checks for pointer disambiguation to enable the missed optimizations. Polyhedral access range analysis and symbolic range analysis enable O(1) range checks for non-overlapping of memory accesses inside loops. However, these approaches work only for the loops in which the loop bounds are loop invariants. To address this limitation, researchers proposed a technique that requires O(log n) memory accesses for pointer disambiguation. Others improved the performance of dynamic checks to single memory access by constraining the object size and alignment. However, the former approach incurs noticeable overhead due to its dynamic checks, whereas the latter has a noticeable allocator overhead. Thus, scalability remains a challenge. In this work, we present a tool, Rapid, that further reduces the overheads of the allocator and dynamic checks proposed in the existing approaches. The key idea is to identify objects that need disambiguation checks using a profiler and allocate them in different regions, which are disjoint memory areas. The disambiguation checks simply compare the regions corresponding to the objects. The regions are aligned such that the top 32 bits in the addresses of any two objects allocated in different regions are always different. As a consequence, the dynamic checks do not require any memory access to ensure that the objects belong to different regions, making them efficient. Rapid achieved a maximum performance benefit of around 52.94% for Polybench and 1.88% for CPU SPEC 2017 benchmarks. The maximum CPU overhead of our allocator is 0.57% with a geometric mean of -0.2% for CPU SPEC 2017 benchmarks. Due to the low overhead of the allocator and dynamic checks, Rapid could improve the performance of 12 out of 16 CPU SPEC 2017 benchmarks. In contrast, a state-of-the-art approach used in the comparison could improve only five CPU SPEC 2017 benchmarks.
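
The region comparison described in the abstract can be illustrated with a short C sketch. This is not Rapid's implementation: the helper names `region_of`, `in_different_regions`, and `add_arrays` are hypothetical, and the code assumes 64-bit addresses plus an allocator that already guarantees objects in different regions are disjoint and differ in the top 32 bits of their addresses. It shows how a loop-versioning check might reduce to a register-only comparison, with a conservative fallback when the check cannot prove disjointness.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: treat the top 32 bits of a 64-bit address as the
 * region identifier. Only arithmetic on the pointer value is needed, so
 * the check performs no memory access. */
static inline uint32_t region_of(const void *p) {
    return (uint32_t)((uint64_t)(uintptr_t)p >> 32);
}

/* Returns nonzero when the two pointers fall in different regions.
 * Assumption: the allocator places objects that need disambiguation in
 * disjoint regions, so different regions imply no overlap. */
static inline int in_different_regions(const void *a, const void *b) {
    return region_of(a) != region_of(b);
}

/* Loop versioning sketch: the fast version tells the compiler the arrays
 * do not alias; the fallback makes no such assumption. */
void add_arrays(float *dst, const float *src, size_t n) {
    if (in_different_regions(dst, src)) {
        float *restrict d = dst;        /* safe only under the assumption above */
        const float *restrict s = src;
        for (size_t i = 0; i < n; i++) {
            d[i] += s[i];
        }
    } else {
        /* Same region: the pointers may or may not overlap, so keep the
         * conservative loop. */
        for (size_t i = 0; i < n; i++) {
            dst[i] += src[i];
        }
    }
}
```

Two pointers in the same region simply take the conservative path, so the sketch stays correct even when the check is inconclusive; the benefit comes only when the allocator has separated the objects beforehand.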

Funder

TCS Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Safety, Risk, Reliability and Quality, Software
