VTT-LLM: Advancing Vulnerability-to-Tactic-and-Technique Mapping through Fine-Tuning of Large Language Model

Published: 2024-04-24 Issue: 9 Volume: 12 Page: 1286
ISSN: 2227-7390
Container-title: Mathematics
Language: en
Short-container-title: Mathematics

Author:

Zhang Chenhui¹, Wang Le¹,², Fan Dunqiu³, Zhu Junyi¹, Zhou Tang¹, Zeng Liyi², Li Zhaohua⁴

Affiliation:

1. Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006, China

2. Peng Cheng Laboratory, Shenzhen 518000, China

3. NSFOCUS Inc., Guangzhou 510006, China

4. Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518110, China

Abstract

Vulnerabilities are often accompanied by cyberattacks. CVE is the largest repository of publicly disclosed vulnerabilities, and it keeps expanding. ATT&CK models known multi-step attacks at both the tactical and the technical level and is kept up to date. For active defense, it is valuable to correlate each vulnerability in CVE with the ATT&CK tactics and techniques that exploit it. Manual mapping is time-consuming and difficult to keep up to date. Existing language-based automated mapping methods do not use information about attack behaviors from outside CVE and ATT&CK and are therefore ineffective. In this paper, we propose VTT-LLM, a novel framework for mapping Vulnerabilities to Tactics and Techniques based on Large Language Models (LLMs), which consists of a generation model and a mapping model. To generate fine-tuning instructions for the LLM, we create a template that extracts knowledge from CWE (a standardized list of common weaknesses) and CAPEC (a standardized list of common attack patterns). We train the generation model of VTT-LLM by fine-tuning the LLM on these instructions. The generation model correlates vulnerabilities and attacks through their descriptions. The mapping model transforms the descriptions of ATT&CK tactics and techniques into vectors through text embedding and then associates them with attacks through semantic matching. By leveraging the knowledge in CWE and CAPEC, VTT-LLM automates the process of linking vulnerabilities in CVE to the attack techniques and tactics of ATT&CK. Experiments on the latest public dataset, ChatGPT-VDMEval, show the effectiveness of VTT-LLM: it reaches an accuracy of 85.18%, which is 13.69% and 54.42% higher than the existing CVET and ChatGPT-based methods, respectively. In addition, compared with fine-tuning without outside knowledge, the accuracy of VTT-LLM with chain fine-tuning is 9.24% higher on average across different LLMs.
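The semantic matching step of the mapping model can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the embedding model, so a toy bag-of-words vector stands in for it here, and the technique descriptions are abbreviated paraphrases written only for this example. The function names `embed`, `cosine`, and `map_attack` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a text-embedding model (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Abbreviated, paraphrased technique descriptions (illustrative only,
# not the official ATT&CK text).
TECHNIQUES = {
    "T1190 Exploit Public-Facing Application":
        "adversaries exploit a weakness in an internet facing application "
        "or server to gain access",
    "T1110 Brute Force":
        "adversaries guess passwords repeatedly to gain access to accounts "
        "when credentials are unknown",
    "T1566 Phishing":
        "adversaries send phishing email messages with malicious "
        "attachments or links to victims",
}


def map_attack(attack_description: str, top_k: int = 1) -> list[tuple[str, float]]:
    """Rank techniques by similarity to a (generated) attack description."""
    query = embed(attack_description)
    scored = [(name, cosine(query, embed(desc))) for name, desc in TECHNIQUES.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical output of the generation model for some vulnerability.
generated = "the attacker repeatedly tries to guess passwords for user accounts"
best_name, best_score = map_attack(generated)[0]
print(best_name)
```

In a realistic setting the `embed` function would be replaced by a learned sentence-embedding model, and the technique vectors would be computed once and cached rather than recomputed for every query.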

Funder

Guangdong Basic and Applied Basic Research Foundation

Guangdong High-level University Foundation Program

Major Key Project of PCL

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2227-7390/12/9/1286/pdf
