Author:
Ban Gu, Xu Lili, Xiao Yang, Li Xinhua, Yuan Zimu, Huo Wei
Abstract
Code from Open Source Software (OSS) is widely reused in software development today. However, reusing certain versions of OSS introduces 1-day vulnerabilities, whose details are publicly available and which may be exploited, leading to serious security issues. Existing state-of-the-art OSS reuse detection work cannot reliably identify the specific versions of reused OSS: the features it selects are not distinctive enough for version detection, and its matching scores are based only on similarity. This paper presents B2SMatcher, a fine-grained version identification tool for OSS in commercial off-the-shelf (COTS) software. We first discuss five kinds of version-sensitive code features that are traceable in both binary and source code. We categorize these features into program-level features and function-level features, and propose a two-stage version identification approach based on these two levels of code features. B2SMatcher also identifies different types of OSS version reuse based on matching scores and matched feature instances. To extract source code features as accurately as possible, B2SMatcher uses machine learning methods to determine which source files are involved in the compilation, and uses function abstraction and normalization methods to eliminate the cost of comparing redundant functions across versions. We evaluated B2SMatcher using 6351 candidate OSS versions and 585 binaries. The results show that B2SMatcher achieves a precision of up to 89.2% and outperforms state-of-the-art tools. Finally, we show how B2SMatcher can be used to evaluate real-world software and to find security risks in practice.
Funder
Innovative Research Group Project of the National Natural Science Foundation of China
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Networks and Communications, Information Systems, Software
Reference
50 articles.
Cited by
9 articles.