Coverage-guided tensor compiler fuzzing with joint IR-pass mutation-Reference-Cited by-同舟云学术

Coverage-guided tensor compiler fuzzing with joint IR-pass mutation

Published:2022-04-29 Issue:OOPSLA1 Volume:6 Page:1-26
ISSN:2475-1421
Container-title:Proceedings of the ACM on Programming Languages
language:en
Short-container-title:Proc. ACM Program. Lang.

Author:

Liu Jiawei¹^ORCID,Wei Yuxiang²^ORCID,Yang Sen³^ORCID,Deng Yinlin¹^ORCID,Zhang Lingming¹^ORCID

Affiliation:

1. University of Illinois at Urbana-Champaign, USA

2. Tongji University, China

3. Fudan University, China

Abstract

In the past decade, Deep Learning (DL) systems have been widely deployed in various application domains to facilitate our daily life, e.g., natural language processing, healthcare, activity recognition, and autonomous driving. Meanwhile, it is extremely challenging to ensure the correctness of DL systems (e.g., due to their intrinsic nondeterminism), and bugs in DL systems can cause serious consequences and may even threaten human lives. In the literature, researchers have explored various techniques to test, analyze, and verify DL models, since their quality directly affects the corresponding system behaviors. Recently, researchers have also proposed novel techniques for testing the underlying operator-level DL libraries (such as TensorFlow and PyTorch), which provide general binary implementations for each high-level DL operator and are the foundation for running DL models on different hardware platforms. However, there is still limited work targeting the reliability of the emerging tensor compilers (also known as DL compilers), which aim to automatically compile high-level tensor computation graphs directly into high-performance binaries for better efficiency, portability, and scalability than traditional operator-level libraries. Therefore, in this paper, we target the important problem of tensor compiler testing, and have proposed Tzer, a practical fuzzing technique for the widely used TVM tensor compiler. Tzer focuses on mutating the low-level Intermediate Representation (IR) for TVM due to the limited mutation space for the high-level IR. More specifically, Tzer leverages both general-purpose and tensor-compiler-specific mutators guided by coverage feedback for diverse and evolutionary IR mutation; furthermore, since tensor compilers provide various passes (i.e., transformations) for IR optimization, Tzer also performs pass mutation in tandem with IR mutation for more effective fuzzing. Our experimental results show that Tzer substantially outperforms existing fuzzing techniques on tensor compiler testing, with 75% higher coverage and 50% more valuable tests than the 2nd-best technique. Also, different components of Tzer have been validated via ablation study. To date, Tzer has detected 49 previously unknown bugs for TVM, with 37 bugs confirmed and 25 bugs fixed (PR merged).

Publisher

Association for Computing Machinery (ACM)

Subject

Safety, Risk, Reliability and Quality,Software

Link

https://dl.acm.org/doi/pdf/10.1145/3527317

Reference71 articles.

1. Martín Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , Manjunath Kudlur , Josh Levenberg , Rajat Monga , Sherry Moore , Derek G. Murray , Benoit Steiner , Paul Tucker , Vijay Vasudevan , Pete Warden , Martin Wicke , Yuan Yu , and Xiaoqiang Zheng . 2016 . TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) . USENIX Association, Savannah, GA. 265–283. isbn:978-1-93 1971-33-1 https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). USENIX Association, Savannah, GA. 265–283. isbn:978-1-931971-33-1 https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi

2. Cython: The Best of Both Worlds

3. Efficient Matrix Multiplication on SIMD Computers

4. Google Security Blog. 2016. Guided in-process fuzzing of Chrome components. https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html Google Security Blog. 2016. Guided in-process fuzzing of Chrome components. https://security.googleblog.com/2016/08/guided-in-process-fuzzing-of-chrome.html

5. Boosting fuzzer efficiency: an information theoretic perspective

Cited by 19 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. The Havoc Paradox in Generator-Based Fuzzing (Registered Report);Proceedings of the 3rd ACM International Fuzzing Workshop;2024-09-13

2. Lightweight Code Coverage Analysis for Deep Learning Framework Testing;2024-08-29

3. History-Driven Fuzzing For Deep Learning Libraries;ACM Transactions on Software Engineering and Methodology;2024-08-16

4. Vulnerability detection through machine learning-based fuzzing: A systematic review;Computers & Security;2024-08

5. Fuzz4All: Universal Fuzzing with Large Language Models;Proceedings of the IEEE/ACM 46th International Conference on Software Engineering;2024-04-12