Exploitation of Vulnerabilities: A Topic-Based Machine Learning Framework for Explaining and Predicting Exploitation
Published: 2023-07-14
Issue: 7
Volume: 14
Page: 403
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information
Author:
Charmanas Konstantinos (1), Mittas Nikolaos (2), Angelis Lefteris (1)
Affiliation:
1. School of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
2. Department of Chemistry, International Hellenic University, 65404 Kavala, Greece
Abstract
Security vulnerabilities are among the most important weaknesses in hardware and software security and can cause severe damage to systems, applications, and users. Software vendors should therefore prioritize the most dangerous and impactful vulnerabilities when developing appropriate countermeasures. Acknowledging the importance of vulnerability prioritization, in the present study we propose a framework that maps newly disclosed vulnerabilities to topic distributions via word clustering, and then predicts whether a new entry will be associated with a potential exploit Proof of Concept (PoC). We also provide insights into the currently most exploitable weaknesses and products through a Generalized Linear Model (GLM) that links the topic memberships of vulnerabilities with exploit indicators, distinguishing five topics that are associated with relatively frequent recent exploits. Our experiments show that the proposed framework can outperform two baseline topic modeling algorithms in terms of topic coherence, improving on LDA models by up to 55%. In terms of classification performance, experiments on a fairly balanced dataset (57% negative observations, 43% positive observations) indicate that vulnerability descriptions can be used as the sole features for assessing the exploitability of vulnerabilities, as the “best” model achieves accuracy close to 87%. Overall, our study contributes to the prioritization of vulnerabilities by providing guidelines on the relations between the textual details of a weakness and potential application/system exploits.
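The pipeline outlined in the abstract (description, then topic distribution, then exploit prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the word clusters, the topic names, the coefficients, and the choice of a logit link for the GLM are all assumptions made up for illustration, and nothing here is fitted on real data.

```python
import math
from collections import Counter

# Hypothetical word clusters (topic -> member words), invented for illustration;
# the paper derives its clusters from data, which this sketch does not do.
WORD_CLUSTERS = {
    "memory_corruption": {"buffer", "overflow", "heap", "stack"},
    "web_injection": {"sql", "injection", "xss", "script"},
    "access_control": {"privilege", "permission", "bypass", "authentication"},
}

# Hypothetical GLM coefficients (logit link assumed), not fitted on any dataset.
INTERCEPT = -1.0
COEFFICIENTS = {
    "memory_corruption": 1.5,
    "web_injection": 2.0,
    "access_control": 0.5,
}


def topic_distribution(description):
    """Map a vulnerability description to topic memberships that sum to 1."""
    tokens = description.lower().replace(",", " ").replace(".", " ").split()
    counts = Counter()
    for token in tokens:
        for topic, words in WORD_CLUSTERS.items():
            if token in words:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        # No known word matched: fall back to a uniform distribution.
        return {topic: 1.0 / len(WORD_CLUSTERS) for topic in WORD_CLUSTERS}
    return {topic: counts[topic] / total for topic in WORD_CLUSTERS}


def exploit_probability(topics):
    """Logistic GLM: probability that an exploit PoC exists, given topic memberships."""
    linear = INTERCEPT + sum(COEFFICIENTS[t] * p for t, p in topics.items())
    return 1.0 / (1.0 + math.exp(-linear))


# Usage with a made-up description.
example = "SQL injection in login form allows authentication bypass."
memberships = topic_distribution(example)
print(memberships, exploit_probability(memberships))
```

In a model of this form, a positive coefficient for a topic means that higher membership in that topic raises the predicted odds of an exploit, which is the kind of reading that lets a GLM single out topics associated with frequent exploits.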
Subject
Information Systems
Cited by
1 article.