Abstract
We introduce CyBERT, a cybersecurity feature-claims classifier built on Bidirectional Encoder Representations from Transformers (BERT) and a key component of our semi-automated cybersecurity vetting process for industrial control systems (ICS). To train CyBERT, we created a corpus of labeled sequences from ICS device documentation collected across a wide range of vendors and devices. This corpus provides the foundation for fine-tuning BERT's language model and supports a prediction-guided relabeling process. We propose an approach for selecting optimal hyperparameters, including the learning rate and the number and configuration of dense layers, to increase the classifier's accuracy. Fine-tuning all hyperparameters of the resulting model increased classification accuracy from 76% with the original BertForSequenceClassification architecture to 94.4% with CyBERT. Furthermore, we evaluated the impact of randomness in CyBERT's initialization, training, and data-sampling phases; across 100 random seeds, validation accuracy varied with a standard deviation of only ±0.6%. Finally, we compared CyBERT with other well-established language models, including GPT-2, ULMFiT, and ELMo, as well as neural network models such as CNN, LSTM, and BiLSTM. CyBERT outperformed these models in both validation accuracy and F1 score, confirming its robustness and accuracy as a cybersecurity feature-claims classifier.
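For readers unfamiliar with this setup, the sketch below illustrates the general pattern the abstract describes: a BERT encoder with additional dense layers on top acting as a claim classifier. It is a minimal, hedged example using the Hugging Face transformers and PyTorch APIs; the checkpoint, layer sizes, dropout, and sample sentence are illustrative assumptions, not the authors' released configuration.

```python
# Illustrative sketch only (not the authors' code): BERT encoder plus a small
# stack of dense layers for claim / non-claim classification.
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class ClaimClassifier(nn.Module):
    def __init__(self, hidden_sizes=(256, 64), num_labels=2, dropout=0.1):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")  # assumed checkpoint
        layers, in_dim = [], self.bert.config.hidden_size
        for h in hidden_sizes:  # configurable dense layers, as tuned in the paper's search
            layers += [nn.Linear(in_dim, h), nn.ReLU(), nn.Dropout(dropout)]
            in_dim = h
        layers.append(nn.Linear(in_dim, num_labels))  # final classification head
        self.head = nn.Sequential(*layers)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.pooler_output)  # logits over claim / non-claim

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = ClaimClassifier()
batch = tokenizer(["Device supports TLS 1.2 for all management traffic."],  # hypothetical sequence
                  padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```

In this kind of setup, the learning rate and the number and width of the dense layers are the hyperparameters one would sweep, matching the tuning dimensions named in the abstract.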
Funder
U.S. Department of Energy
Cited by
26 articles.