DiPri: Distance-based Seed Prioritization for Greybox Fuzzing

Published: 2024-03-26
ISSN: 1049-331X
Container-title: ACM Transactions on Software Engineering and Methodology
Language: en
Short-container-title: ACM Trans. Softw. Eng. Methodol.

Author:

Qian Ruixiang¹, Zhang Quanjun¹, Fang Chunrong¹, Yang Ding¹, Li Shun¹, Li Binyu¹, Chen Zhenyu¹

Affiliation:

1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China

Abstract

Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, a greybox fuzzer continuously generates new test inputs to execute the program under test and uses code coverage as feedback to drive further executions. Seed prioritization is an important step of greybox fuzzing: it determines which promising seeds are chosen first for input generation. However, mainstream greybox fuzzers such as AFL++ and Zest tend to neglect seed prioritization. They may simply pick seeds in the order in which they were queued, or in a randomly produced order, which can degrade their performance in exploring code and exposing bugs. Meanwhile, existing state-of-the-art techniques such as Alphuzz and K-Scheduler adopt complex strategies to schedule seeds. Although powerful, such strategies inevitably incur substantial overhead and reduce the scalability of these techniques. In this paper, we propose a novel distance-based seed prioritization approach named DiPri to facilitate greybox fuzzing. Specifically, DiPri evaluates the queued seeds according to seed distances and prioritizes the outliers, that is, the seeds farthest from the others, to improve the probability of discovering previously unexplored code regions. To evaluate DiPri thoroughly, we prototype it on AFL++ and conduct large-scale experiments with four baselines and 24 C/C++ fuzz targets, of which eight are from widely adopted real-world projects, eight are from the coverage-based benchmark FuzzBench, and eight are from the bug-based benchmark Magma.
The results, obtained from fuzzing campaigns exceeding 50,000 CPU hours, suggest that DiPri (1) does not significantly affect the host fuzzer's code coverage capability, slightly improving branch coverage on the eight targets from real-world projects and slightly reducing it on the eight targets from FuzzBench, and (2) improves the host fuzzer's bug-finding capability, triggering five more Magma bugs. Beyond the evaluation with the three C/C++ benchmarks, we integrate DiPri into the Java fuzzer Zest and conduct experiments on a Java benchmark composed of five real-world programs for more than 8,000 CPU hours to empirically study the scalability of DiPri. The results on the Java benchmark demonstrate that DiPri scales well and can help the host fuzzer find bugs more consistently.
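
The core idea stated in the abstract, ranking queued seeds so that those farthest from the rest come first, can be illustrated with a minimal sketch. The abstract does not specify how seed distance is measured or aggregated, so everything concrete here is an assumption for illustration only: each seed is represented by a hypothetical coverage bit vector, distance is the Hamming distance between vectors, and a seed's score is the sum of its distances to all other seeds. The names `hamming`, `prioritize`, and `queue` are hypothetical and are not taken from DiPri or AFL++.

```python
from typing import Dict, List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count positions where two equal-length coverage vectors differ."""
    if len(a) != len(b):
        raise ValueError("coverage vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def prioritize(seeds: Dict[str, Sequence[int]]) -> List[str]:
    """Order seed names from most to least outlying.

    Assumed scoring rule: a seed's score is the sum of its Hamming
    distances to every other queued seed. Ties keep insertion order
    because Python's sort is stable.
    """
    scores = {
        name: sum(hamming(vec, other) for o, other in seeds.items() if o != name)
        for name, vec in seeds.items()
    }
    return sorted(seeds, key=lambda name: -scores[name])


# Hypothetical queue: three seeds cover similar branches, one covers others.
queue = {
    "seed_a": [1, 1, 0, 0, 0, 0],
    "seed_b": [1, 1, 1, 0, 0, 0],
    "seed_c": [1, 0, 0, 0, 0, 0],
    "seed_d": [0, 0, 0, 1, 1, 1],
}
order = prioritize(queue)
print(order)  # ['seed_d', 'seed_b', 'seed_a', 'seed_c']
```

In this toy queue, `seed_d` is picked first because it shares no covered positions with the other three seeds. Computing all pairwise distances costs quadratic time in the queue size, so a practical scheduler would likely need a cheaper scheme; the sketch ignores that concern.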

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3654440

References (83 articles)

1. AFL. Accessed 2023-04-30. American Fuzzy Lop GitHub Repository. https://github.com/google/AFL.

2. AFL++Team. Accessed 2023-04-30. American Fuzzy Lop Plus Plus (afl++). https://github.com/AFLplusplus/AFLplusplus.

3. AFL++Team. Accessed 2023-05-07. Corpus minimization for American Fuzzy Lop. https://github.com/AFLplusplus/AFLplusplus/blob/stable/afl-cmin.

4. AFL++Team. Accessed 2023-11-16. AFL/AFL++ Test Cases. https://github.com/AFLplusplus/AFLplusplus/tree/stable/testcases.

5. Alphuzz-Team. Accessed December 2023. Alphuzz artifacts. https://github.com/zzyyrr/Alphuzz.