Affiliation:
1. CSIRO, Australia
2. CSIRO & Australian National University, Australia
3. Huawei, China
4. CSIRO & University of New South Wales, Australia
Abstract
Vulnerable third-party libraries pose significant threats to the software applications that reuse them. At an industry scale of reuse, manual analysis of third-party library vulnerabilities is easily overwhelmed by the sheer number of vulnerabilities continually collected from diverse sources for thousands of reused libraries. Our study of four large-scale, actively maintained vulnerability databases (NVD, IBM X-Force, ExploitDB, and Openwall) reveals widespread information discrepancies between reports of the same vulnerability from heterogeneous sources, across seven vulnerability aspects: product, version, component, vulnerability type, root cause, attack vector, and impact. Integrating and cross-validating multi-source vulnerability information would be beneficial, but doing so demands automatic aspect extraction and aspect discrepancy detection. In this work, we experimented with a wide range of NLP methods to extract named entities (e.g., product) and free-form phrases (e.g., root cause) from textual vulnerability reports and to detect semantically different aspect mentions between reports. Our experiments confirm the feasibility of applying NLP methods to automate aspect-level vulnerability analysis and identify the need for domain customization of general NLP methods. Based on our findings, we propose a discrepancy-aware, aspect-level vulnerability knowledge graph (KG) and a KG-based web portal that integrates diverse key-aspect vulnerability information from heterogeneous vulnerability databases. Our user study demonstrates the usefulness of the web portal. Our study opens the door to new types of vulnerability integration and management, such as vulnerability portraits of a product and explainable prediction of silent vulnerabilities.
Publisher
Association for Computing Machinery (ACM)