Object Intersection Captures on Interactive Apps to Drive a Crowd-sourced Replay-based Compiler Optimization

Author:

Mpeis Paschalis1ORCID,Petoumenos Pavlos2,Hazelwood Kim3,Leather Hugh3

Affiliation:

1. University of Edinburgh, Manchester, England, UK

2. University of Manchester, UK

3. Facebook AI Research, Menlo Park, California, USA

Abstract

Traditional offline optimization frameworks rely on representative hardware, software, and inputs to compare different optimizations on. With application-specific optimization for mobile systems though, the idea of a representative testbench is unrealistic while creating offline inputs is non-trivial. Online approaches partially overcome these problems but they might expose users to suboptimal or even erroneous code. Therefore, our mobile code is poorly optimized, resulting in wasted performance and energy and user frustration. In this article, we introduce a novel compiler optimization approach designed for mobile applications. It requires no developer effort, it tunes applications for the user’s device and usage patterns, and it has no negative impact on the user experience. It is based on a lightweight capture and replay mechanism. Our previous work [ 46 ] captures the state accessed by any targeted code region during its online stage. By repurposing existing OS capabilities, it keeps the overhead low. In its offline stage, it replays the code region but under different optimization decisions to enable sound comparisons of different optimizations under realistic conditions. In this article, we propose a technique that further decreases the storage sizes without any additional overhead. It captures only the intersection of reachable objects and accessed heap pages. We compare this with another new approach that has minimal runtime overheads at the cost of higher capture sizes. Coupled with a search heuristic for the compiler optimization space, our capture and replay mechanism allows us to discover optimization decisions that improve performance without testing these decisions directly on the user. Finally, with crowd-sourcing we split this offline evaluation effort between several users, allowing us to discover better code in less time. We implemented a prototype system in Android based on LLVM combined with a genetic search engine and a crowd-sourcing architecture. We evaluated it on both benchmarks and real Android applications. Online captures are infrequent and introduce ~5 ms or 15 ms on average, depending on the approach used. For this negligible effect on user experience, we achieve speedups of 44%  on average over the Android compiler and 35% over LLVM -O3. Our collaborative search is just 5% short of that speedup, which is impressive given the acceleration gains. The user with the highest workload concluded the search 7 \( \times \) faster.

Funder

Royal Academy of Engineering

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture,Information Systems,Software

Reference58 articles.

1. OCEANS: Optimizing compilers for embedded applications

2. Using Machine Learning to Focus Iterative Optimization

3. The Algorithms. 2020. Sorting Algoirthms. Retrieved from https://github.com/TheAlgorithms/Java.

4. Finding effective compilation sequences

5. Android. 2020. Android Compiler: Pathological Cases It Cannot Compile. Retrieved from https://android.googlesource.com/platform/art/+/refs/tags/android-10.0.0_r11/compiler/compiler.cc#48.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3