Understanding and Finding Java Decompiler Bugs

Author:

Lu Yifei1ORCID,Hou Weidong1ORCID,Pan Minxue1ORCID,Li Xuandong1ORCID,Su Zhendong2ORCID

Affiliation:

1. Nanjing University, Nanjing, China

2. ETH Zurich, Zurich, Switzerland

Abstract

Java decompilers are programs that perform the reverse process of Java compilers, i.e., they translate Java bytecode to Java source code. They are essential for reverse engineering purposes and have become more sophisticated and reliable over the years. However, it remains challenging for modern Java decompilers to reliably perform correct decompilation on real-world programs. To shed light on the key challenges of Java decompilation, this paper provides the first systematic study on the characteristics and causes of bugs in mature, widely-used Java decompilers. We conduct the study by investigating 333 unique bugs from three popular Java decompilers. Our key findings and observations include: (1) Although most of the reported bugs were found when decompiling large, real-world code, 40.2% of them have small test cases for bug reproduction; (2) Over 80% of the bugs manifest as exceptions, syntactic errors, or semantic errors, and bugs with source code artifacts are very likely semantic errors; (3) 57.7%, 39.0%, and 41.1% of the bugs respectively are attributed to three stages of decompilers—loading structure entities from bytecode, optimizing these entities, and generating source code from these entities; (4) Bugs in decompilers’ type inference are the most complex to fix; and (5) Region restoration for structures like loop, sugaring for special structures like switch, and type inference of variables of generic types or indistinguishable types are the three most significant challenges in Java decompilation, which to some extent explains our findings in (3) and (4). Based on these findings, we present JD-Tester, a differential testing framework for Java decompilers, and our experience of using it in testing the three popular Java decompilers. JD-Testerutilizes different Java program generators to construct executable Java tests and finds exceptions, syntactic, and semantic inconsistencies (i.e. bugs) between a generated test and its compiled-decompiled version (through compilation and execution). In total, we have found 62 bugs in the three decompilers, demonstrating both the effectiveness of JD-Tester, and the importance of testing and validating Java decompilers.

Publisher

Association for Computing Machinery (ACM)

Reference41 articles.

1. Unstructured data collection from APK files for malware detection;Agrawal Prerna;International Journal of Computer Applications,2020

2. An Intelligent Behavior-Based Ransomware Detection System For Android Platform

3. Control flow obfuscation for Android applications

4. Confuzzion: A Java Virtual Machine Fuzzer for Type Confusion Vulnerabilities

5. David Brumley, JongHyup Lee, Edward J Schwartz, and Maverick Woo. 2013. Native x86 decompilation using $semantics-preserving$ structural analysis and iterative $control-flow$ structuring. In 22nd USENIX Security Symposium (USENIX Security 13). 353–368.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3