SAViP: Semantic-Aware Vulnerability Prediction for Binary Programs with Neural Networks

Published: 2023-02-10, Issue: 4, Volume: 13, Page: 2271
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences

Author:

Zhou Xu¹, Duan Bingjie¹, Wu Xugang¹, Wang Pengfei¹

Affiliation:

1. College of Computer, National University of Defense Technology, Changsha 410073, China

Abstract

Vulnerability prediction, in which static analysis is leveraged to predict the vulnerabilities of binary programs, has become a popular research topic. Traditional vulnerability prediction methods depend on vulnerability patterns, which must be predefined by security experts in a time-consuming manner. The development of Artificial Intelligence (AI) has yielded new options for vulnerability prediction. Neural networks allow vulnerability patterns to be learned automatically. However, current works extract only one or two types of features and use traditional models such as word2vec, which results in the loss of much instruction-level information. In this paper, we propose a model named SAViP to predict vulnerabilities in binary programs. To fully extract binary information, we integrate three kinds of features: semantic, statistical, and structural features. For semantic features, we apply the Masked Language Model (MLM) pre-training task of the RoBERTa model to the assembly code to build our language model. Using this model, we innovatively combine the beginning token and the operation-code token to create the instruction embedding. For the statistical features, we design a 56-dimensional feature vector that contains 43 kinds of instructions. For the structural features, we improve the ability of the structure2vec network to obtain the characteristic of the network by emphasizing node self-attention. Through these optimizations, we significantly increase the accuracy of vulnerability prediction over existing methods. Our experiments show that SAViP achieves a recall of 77.85% and Top 100∼600 accuracies all above 95%. The results are 10% and 13% higher than those of the state-of-the-art V-Fuzz, respectively.

Funder

National University of Defense Technology Research Project

National Natural Science Foundation China

HUNAN Province Natural Science Foundation

National High-level Personnel for Defense Technology Program

National Key Research and Development Program of China

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/4/2271/pdf

References: 41 articles.

1. (2022, January 07). SonarQube. Available online: https://www.sonarqube.org/.

2. (2022, January 07). DeepScan. Available online: https://deepscan.io/.

3. (2022, January 07). Reshift Security. Available online: https://www.reshiftsecurity.com/.

4. Wang, S., Liu, T., and Tan, L. (2016, May 14–22). Automatically Learning Semantic Features for Defect Prediction. Proceedings of the 2016 IEEE/ACM 38th International Conference on Software Engineering, Austin, TX, USA.

5. Li, Z., Zou, D., Xu, S., Ou, X., Jin, H., Wang, S., Deng, Z., and Zhong, Y. (2018, February 18–21). VulDeePecker: A deep learning-based system for vulnerability detection. Proceedings of the 25th Network and Distributed System Security Symposium, San Diego, CA, USA.