Boosting Fuzzer Efficiency: An Information Theoretic Perspective

Published: 2023-10-20 Issue: 11 Volume: 66 Page: 89-97
ISSN: 0001-0782
Container-title: Communications of the ACM
Language: en
Short-container-title: Commun. ACM

Author:

Böhme Marcel¹, Manès Valentin J. M.², Cha Sang Kil²

Affiliation:

1. MPI-SP, Germany; Monash University, Australia

2. CSRC, KAIST, Korea

Abstract

In this paper, we take the fundamental perspective of fuzzing as a learning process. Suppose that, before fuzzing, we know nothing about the behaviors of a program P: What does it do? Executing the first test input, we learn how P behaves for this input. Executing the next input, we either observe the same behavior or discover a new one. As such, each execution reveals "some amount" of information about P's behaviors. A classic measure of information is Shannon's entropy. Measuring entropy allows us to quantify how much is learned from each generated test input about the behaviors of the program. Within a probabilistic model of fuzzing, we show how entropy also measures fuzzer efficiency. Specifically, it measures the general rate at which the fuzzer discovers new behaviors. Intuitively, efficient fuzzers maximize information. From this information-theoretic perspective, we develop ENTROPIC, an entropy-based power schedule for greybox fuzzing that assigns more energy to seeds that maximize information. We implemented ENTROPIC in the popular greybox fuzzer LIBFUZZER. Our experiments with more than 250 open-source programs (60 million LoC) demonstrate substantially improved efficiency and confirm our hypothesis that an efficient fuzzer maximizes information. ENTROPIC has been independently evaluated and integrated into the main-line LIBFUZZER as the default power schedule. ENTROPIC now runs on more than 25,000 machines, fuzzing hundreds of security-critical software systems simultaneously and continuously.
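
As a rough illustration of the entropy measure the abstract refers to, the sketch below computes Shannon's entropy over the behaviors observed while executing a batch of inputs. The function name and the behavior labels are hypothetical, and the plain maximum-likelihood estimate used here is an assumption made for illustration; it is not claimed to be the estimator used in ENTROPIC.

```python
import math
from collections import Counter


def shannon_entropy(observations):
    """Maximum-likelihood estimate of Shannon entropy, in bits.

    `observations` holds one (hypothetical) behavior label per executed input.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # nothing executed yet, so nothing learned
    # H = -sum(p_i * log2(p_i)) over the observed behaviors
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Four inputs, each exercising a different behavior: maximal information.
diverse = ["a", "b", "c", "d"]
# Four inputs, three of which repeat the same behavior: less information.
repetitive = ["a", "a", "a", "b"]

print(shannon_entropy(diverse))
print(shannon_entropy(repetitive))
```

Under this estimate, the batch that keeps revealing new behaviors scores higher than the batch that mostly repeats one, which matches the intuition that an efficient fuzzer maximizes information.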

Funder

Australian Research Council

Australian Research Data Commons

Ministry of Science and ICT, South Korea

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3611019

References: 24 articles.

1. Coverage and fault detection of the output-uniqueness test selection criteria

2. A practical guide for using statistical tests to assess randomized algorithms in software engineering

3. STADS

4. Böhme, M., Falk, B. Fuzzing: On the exponential cost of vulnerability discovery. In Proceedings of the 14th Joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE) (2020), 1-12.

5. Estimating residual risk in greybox fuzzing

Cited by 1 article.

1. LinFuzz: Program-Sensitive Seed Scheduling Greybox Fuzzing Based on LinUCB Algorithm; IEEE Access; 2024