Affiliation:
1. University of Technology Sydney, Australia
2. Georgia Institute of Technology, USA
Abstract
Given an edge-labeled graph, context-free language reachability (CFL-reachability) computes reachable node pairs by deriving new edges and adding them to the graph. The redundancy that limits the scalability of CFL-reachability manifests as redundant derivations, i.e., identical edges can be derived multiple times due to the many paths between two reachable nodes. We observe that most redundancy arises from the derivations involving transitive relations of reachable node pairs. Unfortunately, existing techniques for reducing redundancy in transitive-closure-based problems are either ineffective or inapplicable to identifying and eliminating redundant derivations during on-the-fly CFL-reachability solving.
This paper proposes a scalable yet precision-preserving approach to all-pairs CFL-reachability analysis by taming its transitive redundancy. Our key insight is that transitive relations are intrinsically ordered, and utilizing the order for edge derivation can avoid most redundancy. To address the challenges in determining the derivation order from the dynamically changed graph during CFL-reachability solving, we introduce a hybrid graph representation by combining spanning trees and adjacency lists, together with a dynamic construction algorithm. Based on this representation, we propose a fast and effective partially ordered algorithm POCR to boost the performance of CFL-reachability analysis by reducing its transitive redundancy during on-the-fly solving. Our experiments on context-sensitive value-flow analysis and field-sensitive alias analysis for C/C++ demonstrate the promising performance of POCR. On average, POCR eliminates 98.50% and 97.26% redundant derivations respectively for the value-flow and alias analysis, achieving speedups of 21.48× and 19.57× over the standard CFL-reachability algorithm. We also compare POCR with two recent open-source tools, Graspan (a CFL-reachability solver) and Soufflé (a Datalog engine). The results demonstrate that POCR is over 3.67× faster than Graspan and Soufflé on average for both value-flow analysis and alias analysis.
Funder
Australian Research Grants
United States National Science Foundations
Defense Advanced Research Projects Agency
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality,Software
Cited by
8 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Better Not Together: Staged Solving for Context-Free Language Reachability;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11
2. Context-Free Language Reachability via Skewed Tabulation;Proceedings of the ACM on Programming Languages;2024-06-20
3. Dynamic Transitive Closure-based Static Analysis through the Lens of Quantum Search;ACM Transactions on Software Engineering and Methodology;2024-06-04
4. Iterative-Epoch Online Cycle Elimination for Context-Free Language Reachability;Proceedings of the ACM on Programming Languages;2024-04-29
5. Two Birds with One Stone: Multi-Derivation for Fast Context-Free Language Reachability Analysis;2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE);2023-09-11