Affiliation:
1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Abstract
Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, greybox fuzzing continuously generates new test inputs to execute the program under test and drives executions with code coverage as feedback. Seed prioritization is an important step of greybox fuzzing that helps the fuzzer choose promising seeds first for input generation. However, mainstream greybox fuzzers like AFL++ and Zest tend to neglect the importance of seed prioritization. They may pick seeds simply in the order in which the seeds were queued, or in a randomized order, which may degrade their performance in exploring code and exposing bugs. Meanwhile, existing state-of-the-art techniques like Alphuzz and K-Scheduler adopt complex strategies to schedule seeds. Although powerful, such strategies inevitably incur considerable overhead and reduce the scalability of these techniques.
In this paper, we propose a novel distance-based seed prioritization approach named DiPri to facilitate greybox fuzzing. Specifically, DiPri evaluates the queued seeds according to seed distances and prioritizes the outliers, i.e., the seeds farthest from the others, to improve the probability of discovering previously unexplored code regions. To evaluate DiPri thoroughly, we prototype it on AFL++ and conduct large-scale experiments with four baselines and 24 C/C++ fuzz targets, of which eight are from widely adopted real-world projects, eight are from the coverage-based benchmark FuzzBench, and eight are from the bug-based benchmark Magma. The results, obtained from fuzzing campaigns exceeding 50,000 CPU hours, suggest that DiPri (1) has no significant effect on the host fuzzer’s code coverage capability, slightly improving branch coverage on the eight targets from real-world projects and slightly reducing it on the eight targets from FuzzBench, and (2) improves the host fuzzer’s bug-finding capability by triggering five more Magma bugs. Besides the evaluation on the three C/C++ benchmarks, we integrate DiPri into the Java fuzzer Zest and conduct experiments on a Java benchmark composed of five real-world programs for more than 8,000 CPU hours to empirically study the scalability of DiPri. The results on the Java benchmark demonstrate that DiPri scales well and can help the host fuzzer find bugs more consistently.
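The outlier-first idea described above can be illustrated with a minimal sketch. The abstract does not say how seeds are represented or which distance metric is used, so everything here is an assumption made for illustration: each seed is modeled as the set of coverage edge IDs it exercises, the distance is the Hamming distance between those sets, and a seed's score is the sum of its distances to all other queued seeds. The names `hamming_distance` and `prioritize_by_distance` are hypothetical and are not taken from the DiPri implementation.

```python
from typing import Dict, FrozenSet, List


def hamming_distance(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """Number of edges covered by exactly one of the two seeds (assumed metric)."""
    return len(a ^ b)


def prioritize_by_distance(coverage: Dict[str, FrozenSet[int]]) -> List[str]:
    """Order seeds so that those farthest from all others come first.

    `coverage` maps a seed name to the set of edge IDs it covers. This
    representation is an assumption for illustration only.
    """
    scores = {}
    for name, edges in coverage.items():
        # Outlier score: total distance from this seed to every other seed.
        scores[name] = sum(
            hamming_distance(edges, other_edges)
            for other_name, other_edges in coverage.items()
            if other_name != name
        )
    # Highest score first; ties are broken by seed name for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


# Toy queue: three seeds with similar coverage and one that covers distinct edges.
queue = {
    "seed_a": frozenset({1, 2, 3}),
    "seed_b": frozenset({1, 2, 4}),
    "seed_c": frozenset({1, 2, 3, 4}),
    "seed_d": frozenset({7, 8, 9}),
}
order = prioritize_by_distance(queue)
print(order)  # seed_d, the outlier, is scheduled first
```

In this toy queue, `seed_d` shares no edges with the other seeds, so it receives the largest total distance and is picked first. The pairwise scoring shown here is quadratic in the queue size; a practical fuzzer integration would presumably need a cheaper evaluation, which this sketch does not attempt to model.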
Publisher
Association for Computing Machinery (ACM)