T-Rec: Fine-Grained Language-Agnostic Program Reduction Guided by Lexical Syntax

Author:

Xu Zhenyang (1), Tian Yongqiang (2), Zhang Mengxiao (1), Zhang Jiarui (1), Liu Puzhuo (3), Jiang Yu (4), Sun Chengnian (1)

Affiliation:

1. University of Waterloo, Canada

2. The Hong Kong University of Science and Technology, China

3. Ant Group, China

4. Tsinghua University, China

Abstract

Program reduction strives to eliminate bug-irrelevant code elements from a bug-triggering program, so that (1) a smaller and more straightforward bug-triggering program can be obtained, and (2) the differences among duplicates (i.e., different programs that trigger the same bug) can be minimized or even eliminated. With such reduction and canonicalization functionality, program reduction facilitates debugging of software, especially language toolchains such as compilers, interpreters, and debuggers. While many program reduction techniques have been proposed, most of them (especially the language-agnostic ones) overlook the potential reduction opportunities hidden within tokens. Their reduction and canonicalization capabilities are therefore significantly restricted. To fill this gap, we propose T-Rec, a fine-grained language-agnostic program reduction technique guided by lexical syntax. Instead of treating tokens as atomic and irreducible components, T-Rec introduces a fine-grained reduction process that leverages the lexical syntax of programming languages to effectively explore the reduction opportunities in tokens. Through comprehensive evaluations with versatile benchmark suites, we demonstrate that T-Rec significantly improves the reduction and canonicalization capability of two existing language-agnostic program reducers (i.e., Perses and Vulcan). T-Rec enables Perses and Vulcan to further eliminate 1,294 and 1,315 duplicates, respectively, in a benchmark suite that contains 3,796 test cases that trigger 46 unique bugs. Additionally, T-Rec reduces up to 65.52% and 53.73% of the bytes in the results of Perses and Vulcan, respectively, on our multi-lingual benchmark suite.
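To make the idea of reducing within a token concrete, the following is a minimal Python sketch. It is not T-Rec's algorithm, which the abstract does not spell out. It assumes a token's lexical rule can be written as a regular expression and that an oracle reports whether the program still triggers the bug. The names `reduce_token`, `HEX_LITERAL`, and `oracle` are hypothetical and exist only for this illustration.

```python
import re
from typing import Callable


def reduce_token(token: str, lexical_rule: str,
                 still_triggers: Callable[[str], bool]) -> str:
    """Greedily delete contiguous character spans from a single token.

    A deletion is kept only if the shortened token still fully matches
    its lexical rule AND the (hypothetical) oracle says the bug still
    triggers. Larger spans are tried first; the search restarts after
    every successful deletion until no deletion applies.
    """
    rule = re.compile(lexical_rule)
    changed = True
    while changed:
        changed = False
        for size in range(len(token), 0, -1):
            for start in range(len(token) - size + 1):
                candidate = token[:start] + token[start + size:]
                if rule.fullmatch(candidate) and still_triggers(candidate):
                    token = candidate
                    changed = True
                    break
            if changed:
                break
    return token


# Assumed lexical rule for a C-like hexadecimal integer literal.
HEX_LITERAL = r"0[xX][0-9a-fA-F]+([uU](ll|LL)?)?"


# Assumed oracle: pretend the bug fires for any hexadecimal literal.
def oracle(tok: str) -> bool:
    return tok.lower().startswith("0x")


reduced = reduce_token("0x1F3AuLL", HEX_LITERAL, oracle)
print(reduced)  # prints 0x1
```

A reducer that treats tokens as atomic would have to keep `0x1F3AuLL` whole or drop it entirely. Checking candidates against the lexical rule allows the token to shrink to `0x1` while remaining a valid literal, which also pushes duplicates that differ only in literal values toward the same canonical form.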

Publisher

Association for Computing Machinery (ACM)

