Affiliation:
1. IMDEA Software Institute, Madrid, Spain
2. University of Texas at Dallas, TX, USA
Abstract
In many applications, source code and debugging symbols of a target program are not available, and the only thing that we can access is the program executable. A fundamental challenge with executables is that, during compilation, critical information such as variables and types is lost. Given that typed variables provide fundamental semantics of a program, for the last 16 years, a large amount of research has been carried out on binary code type inference, a challenging task that aims to infer typed variables from executables (also referred to as binary code). In this article, we systematize the area of binary code type inference according to its most important dimensions: the applications that motivate its importance, the approaches used, the types that those approaches infer, the implementation of those approaches, and how the inference results are evaluated. We also discuss limitations, underdeveloped problems and open challenges, and propose further applications.
Funder
NSF
Regional Government of Madrid
AFOSR
Spanish Government
Publisher
Association for Computing Machinery (ACM)
Subject
General Computer Science, Theoretical Computer Science
Cited by
38 articles.
1. Multi-modal Learning for WebAssembly Reverse Engineering;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11
2. Integrating Flow and Program Analysis for Enhanced Protocol Reverse Engineering;2023 20th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP);2023-12-15
3. Operand-Variation-Oriented Differential Analysis for Fuzzing Binding Calls in PDF Readers;2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE);2023-05
4. Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries;2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER);2023-03
5. FastKLEE: faster symbolic execution via reducing redundant bound checking of type-safe pointers;Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering;2022-11-07