Affiliation:
1. ETH Zurich, Zurich, Switzerland
Abstract
To optimize a program, a compiler needs precise information about it. Significant effort is dedicated to improving the ability of compilers to analyze programs, with the expectation that more information results in better optimization. But this assumption does not always hold: due to unexpected interactions between compiler components and phase ordering issues, sometimes more information leads to worse optimization. This can lead to wasted research and engineering effort whenever compilers cannot efficiently leverage additional information. In this work, we systematically examine the extent to which additional information can be detrimental to compilers. We consider two types of information: dead code, i.e., whether a program location is unreachable, and value ranges, i.e., the possible values a variable can take at a specific program location. Given a seed program, we refine it with additional information and check whether this degrades the output. Based on this approach, we develop a fully automated and effective testing method for identifying such issues, and through an extensive evaluation and analysis, we quantify their existence and prevalence in widely used compilers. In particular, we have reported 59 cases in GCC and LLVM, of which 55 have been confirmed or fixed so far, highlighting the practical relevance and value of our findings. This work’s fresh perspective opens up a new direction in understanding and improving compilers.
Publisher
Association for Computing Machinery (ACM)
Reference34 articles.
1. Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman. 2006. Compilers: Principles, Techniques, and Tools (2nd Edition). Addison-Wesley Longman Publishing Co., Inc., USA. isbn:0321486811
2. Compiler transformations for high-performance computing
3. Finding missed compiler optimizations by differential testing
4. A Survey of Compiler Testing
5. Metamorphic Testing