Context-Free Language Reachability via Skewed Tabulation

Author:

Lei Yuxiang (1), Bossut Camille (2), Sui Yulei (1), Zhang Qirun (2)

Affiliation:

1. University of New South Wales, Sydney, Australia

2. Georgia Institute of Technology, Atlanta, USA

Abstract

Context-free language reachability (CFL-reachability) is a prominent model for formulating program analysis problems. Almost all CFL-reachability algorithms are based on the Reps-Horwitz-Sagiv (RHS) tabulation. In essence, the RHS tabulation, based on normalized context-free grammars, is similar to the CYK algorithm for CFL-parsing. Consider a normalized rule S ::= A B and a CFL-reachability problem instance of computing S-edges in the input graph. The RHS tabulation obtains all summary edges (i.e., S-, A-, and B-edges) based on the grammar rules. However, many A- and B-edges are wasted because only a subset of those edges eventually contributes to generating S-edges in the input graph. This paper proposes a new tabulation strategy for speeding up CFL-reachability by eliminating wasted and unnecessary summary edges. We particularly focus on recursive nonterminals. Our key technical insight is that the wasted edge generations and insertions caused by recursive nonterminals can be avoided by modifying the parse trees either statically (by transforming the grammar) or dynamically (using a specialized online CFL-reachability solver). For example, if a recursive nonterminal B, generated by a rule B ::= B X, appears on the right-hand side of a rule S ::= A B, we can make S recursive (by introducing a new rule S ::= S X) and eliminate the original recursive rule (B ::= B X). Due to the rule S ::= S X, the shapes of the parse trees associated with the left-hand-side nonterminal S become more "skewed". Thus, we name our approach skewed tabulation for CFL-reachability. Skewed tabulation can significantly improve the scalability of CFL-reachability by reducing wasted and unnecessary summary edges. We have implemented skewed tabulation and applied the corresponding CFL-reachability algorithm to an alias analysis, a value-flow analysis, and a taint analysis. Our extensive evaluation based on SPEC 2017 benchmarks yields promising results. 
For the three client analyses, CFL-reachability based on skewed tabulation can achieve 3.34×, 1.13× and 2.05× speedup over the state-of-the-art RHS-tabulation-based CFL-reachability solver and consume 60.05%, 20.38% and 63.06% less memory, respectively. Furthermore, the cost of grammar transformation for skewed tabulation is negligible, typically taking less than one second.
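The grammar transformation described in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's solver: `cfl_reach` is a plain worklist tabulation written for this illustration, and the terminal labels `a`, `b`, `x`, the base rule B ::= b, and the six-node graph are all assumptions chosen to make the effect visible. The sketch also assumes that B occurs only on the right-hand side of S ::= A B, so dropping B ::= B X does not change which S-edges exist.

```python
from collections import defaultdict, deque

def cfl_reach(edges, rules):
    """Toy worklist CFL-reachability. Each rule is (lhs, (y,)) or (lhs, (y, z))."""
    found = set(edges)
    work = deque(edges)
    out_by = defaultdict(set)  # (src, label) -> destinations
    in_by = defaultdict(set)   # (dst, label) -> sources
    for u, label, v in edges:
        out_by[(u, label)].add(v)
        in_by[(v, label)].add(u)

    def add(u, label, v):
        # Record a newly derived edge once and schedule it for processing.
        if (u, label, v) not in found:
            found.add((u, label, v))
            out_by[(u, label)].add(v)
            in_by[(v, label)].add(u)
            work.append((u, label, v))

    while work:
        u, y, v = work.popleft()
        for lhs, rhs in rules:
            if rhs == (y,):
                add(u, lhs, v)
            elif len(rhs) == 2:
                if rhs[0] == y:  # popped edge is the left operand
                    for w in list(out_by[(v, rhs[1])]):
                        add(u, lhs, w)
                if rhs[1] == y:  # popped edge is the right operand
                    for w in list(in_by[(u, rhs[0])]):
                        add(w, lhs, v)
    return found

# Hypothetical input graph: one a-edge, three b-edges, and a chain of x-edges.
graph = [(0, "a", 1), (1, "b", 2), (5, "b", 2), (6, "b", 2),
         (2, "x", 3), (3, "x", 4)]

# Original grammar: the recursion sits on B.
original = [("S", ("a", "B")), ("B", ("b",)), ("B", ("B", "x"))]
# Transformed grammar: the recursion is moved onto S (S ::= S x).
skewed = [("S", ("a", "B")), ("B", ("b",)), ("S", ("S", "x"))]

def summary_edges(result):
    # Summary edges are the derived edges, i.e. everything not in the input.
    return result - set(graph)

def s_edges(result):
    return {(u, v) for u, label, v in result if label == "S"}

orig_result = cfl_reach(graph, original)
skew_result = cfl_reach(graph, skewed)

print(sorted(s_edges(orig_result)), len(summary_edges(orig_result)))
print(sorted(s_edges(skew_result)), len(summary_edges(skew_result)))
```

On this toy graph both grammars yield the same three S-edges, but the original grammar derives 12 summary edges while the transformed grammar derives 6. The difference comes from the B-edges starting at nodes 5 and 6: the original grammar extends them along the x-chain even though no a-edge ever reaches those nodes, so they never contribute to an S-edge.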

Funder

Australian Research Grants

Google

NSF

Defense Advanced Research Projects Agency

Publisher

Association for Computing Machinery (ACM)
