Affiliation:
1. Microsoft, Redmond, WA, USA
2. Microsoft, Bangalore, India
Abstract
This paper describes an inter-procedural technique for computing symbolic bounds on the number of statements a procedure executes in terms of its scalar inputs and user-defined quantitative functions of input data-structures. Such computational complexity bounds for even simple programs are usually disjunctive, non-linear, and involve numerical properties of heaps. We address the challenges of generating these bounds using two novel ideas.
We introduce a proof methodology based on multiple counter instrumentation (each counter can be initialized and incremented at potentially multiple program locations) that allows a given linear invariant generation tool to compute linear bounds individually on these counter variables. The bounds on these counters are then composed together to generate total bounds that are non-linear and disjunctive. We also give an algorithm for automating this proof methodology. Our algorithm generates complexity bounds that are usually precise not only in terms of the computational complexity, but also in terms of the constant factors.
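The counter idea can be illustrated with a small hand-instrumented sketch. It is not the paper's algorithm: the loop, the counter names `c1` and `c2`, and the function `run_with_counters` are hypothetical, and the linear per-counter bounds are checked at run time with assertions rather than derived by an invariant generation tool. For this particular loop, `c2` is reset each time `c1` is incremented, so the two linear bounds compose into a total bound that is both non-linear (a product) and disjunctive (it uses `max`).

```python
def run_with_counters(n, m):
    """Run a single loop whose iteration count is non-linear in n and m.

    Returns (actual_iterations, composed_bound). The counters c1 and c2 are
    illustrative instrumentation, not part of the original loop.
    """
    x = y = 0
    c1 = c2 = 0                 # hypothetical counters
    total = 0                   # ground truth: number of loop iterations

    # Linear bounds we expect each counter to satisfy individually.
    bound_c1 = max(0, n)
    bound_c2 = max(0, m)

    while x < n:
        total += 1
        if y < m:
            y += 1
            c2 += 1             # c2 counts iterations of the "inner" phase
        else:
            y = 0
            x += 1
            c1 += 1             # c1 counts iterations of the "outer" phase
            c2 = 0              # c2 is re-initialized where c1 is incremented
        # Each counter stays within its own linear bound.
        assert c1 <= bound_c1 and c2 <= bound_c2

    # Composition for this loop: every c1 step is preceded by at most
    # bound_c2 steps of c2, giving bound_c1 * (1 + bound_c2) in total.
    return total, bound_c1 * (1 + bound_c2)


if __name__ == "__main__":
    for n, m in [(3, 2), (4, -1), (0, 5)]:
        actual, bound = run_with_counters(n, m)
        print(f"n={n}, m={m}: iterations={actual}, bound={bound}")
```

Neither counter alone has a linear bound that covers the whole loop, but each has one between its resets, which is the property the composition relies on.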
Next, we introduce the notion of user-defined quantitative functions that can be associated with abstract data-structures, e.g., length of a list, height of a tree, etc. We show how to compute bounds in terms of these quantitative functions using a linear invariant generation tool that has support for handling uninterpreted functions. We show the application of this methodology to commonly used data-structures (namely lists, lists of lists, trees, and bit-vectors) using examples from Microsoft product code. We observe that a few quantitative functions for each data-structure are usually sufficient to allow generation of symbolic complexity bounds for a variety of loops that iterate over these data-structures, and that it is straightforward to define these quantitative functions.
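As a rough illustration of a quantitative function, the sketch below bounds a list traversal in terms of the list's length. The names `Node`, `length`, `from_values`, and `traverse` are hypothetical and chosen only for this example. Here `length` is computed concretely so the invariant can be checked at run time; in the setting the abstract describes, such a function would instead be treated as an uninterpreted symbol by the invariant generation tool.

```python
class Node:
    """Singly linked list cell (illustrative data-structure)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def length(node):
    """Quantitative function: number of cells reachable from node."""
    count = 0
    while node is not None:
        count += 1
        node = node.next
    return count


def from_values(values):
    """Build a linked list holding the given values in order."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def traverse(head):
    """Walk the list with a counter c and return the iteration count.

    The invariant c + length(e) == length(head) is linear once length(...)
    is viewed as an opaque term, and it implies the bound c <= length(head).
    """
    total_len = length(head)
    c = 0
    e = head
    while e is not None:
        assert c + length(e) == total_len   # invariant at loop head
        e = e.next
        c += 1
    assert c <= total_len                   # bound in terms of length(head)
    return c


if __name__ == "__main__":
    print(traverse(from_values([10, 20, 30])))
```

The same pattern would apply to other quantitative functions mentioned above, such as the height of a tree, with the effect of each data-structure operation on the function stated by the user.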
The combination of these techniques enables generation of precise computational complexity bounds for real-world examples (drawn from Microsoft product code and C++ STL library code) for some of which it is non-trivial to even prove termination. Such automatically generated bounds are very useful for early detection of egregious performance problems in large modular codebases that are constantly being changed by multiple developers who make heavy use of code written by others without a good understanding of their implementation complexity.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
161 articles.
1. Robust Resource Bounds with Static Analysis and Bayesian Inference;Proceedings of the ACM on Programming Languages;2024-06-20
2. Quantitative Bounds on Resource Usage of Probabilistic Programs;Proceedings of the ACM on Programming Languages;2024-04-29
3. Towards Developing Effective Fault localization Technique for Termination Bugs in Loop Programs;Proceedings of the 5th ACM/IEEE International Workshop on Automated Program Repair;2024-04-20
4. Enhancing Performance Bug Prediction Using Performance Code Metrics;Proceedings of the 21st International Conference on Mining Software Repositories;2024-04-15
5. Modeling and Analyzing Evaluation Cost of CUDA Kernels;ACM Transactions on Parallel Computing;2024-03-12