Affiliation:
1. Federal University of Minas Gerais, Brazil
2. Federal University of Ouro Preto, Brazil
Abstract
Incomplete source code naturally emerges in software development: during the design phase, and while evolving, testing, and analyzing programs. Therefore, the ability to understand partial programs is a valuable asset. However, this problem is still unsolved for the C programming language. Difficulties stem from the fact that parsing C requires not only syntactic but also semantic information. Furthermore, inferring types so that they respect C's type system is a challenging task. In this paper we present a technique that lets us solve these problems. We provide a unification-based type inference capable of dealing with C's intricacies. The ideas we present let us reconstruct partial C programs into complete, well-typed ones. Such program reconstruction has several applications: enabling static analysis tools in scenarios where software components may be absent; improving static analysis tools that do not rely on build specifications; allowing stub-generation and testing tools to work on snippets; and assisting programmers in the extraction of reusable data structures from the program parts that use them. Our evaluation is performed on source code from a variety of C libraries such as GNU's Coreutils, GNULib, GNOME's GLib, and GDSL; on implementations from Sedgewick's books; and on snippets from popular open-source projects like CPython, FreeBSD, and Git.
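To make the idea of reconstruction concrete, the following is a minimal sketch of what a completed program might look like. It is an illustration only, not output of the paper's tool: the names `node_t`, `sum_keys`, `key`, and `next` are hypothetical. The function is assumed to be the original partial snippet, which uses a type whose declaration is missing; the declarations above it are one possible completion consistent with how the snippet uses that type.

```c
#include <stddef.h>

/* Hypothetical reconstructed declarations (assumed for illustration).
 * The snippet below never declares node_t, so a compiler would reject it
 * on its own. Knowing that node_t names a type is also what lets a parser
 * read `node_t *n` as a declaration rather than a multiplication. */
typedef struct node node_t;
struct node {
    int     key;   /* one type consistent with `total += n->key` */
    node_t *next;  /* `n = n->next` forces next to have the type of n */
};

/* The partial snippet itself, unchanged. */
int sum_keys(node_t *n)
{
    int total = 0;
    while (n != NULL) {
        total += n->key;
        n = n->next;
    }
    return total;
}
```

With the two declarations in place, the translation unit compiles as ordinary C, which is the property the abstract refers to as a complete, well-typed program.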
Publisher
Association for Computing Machinery (ACM)
Subject
Safety, Risk, Reliability and Quality, Software
Cited by
11 articles.
1. Partial program analysis for staged compilation systems;Formal Methods in System Design;2024-06-13
2. SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly;2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO);2024-03-02
3. Statistical Type Inference for Incomplete Programs;Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering;2023-11-30
4. LExecutor: Learning-Guided Execution;Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering;2023-11-30
5. Program representations for predictive compilation: State of affairs in the early 20’s;Journal of Computer Languages;2022-12