Affiliation:
1. Capitol Technology University, USA
Abstract
This study evaluates the bidirectional encoder representations from transformers (BERT) model for Android malware detection, comparing its performance against conventional deep learning models such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. The research uses two widely used datasets, the Drebin dataset and the CIC AndMal2017 dataset, both known for their extensive collections of Android malware and benign applications. The models are evaluated on accuracy, precision, recall, and F1 score. The study also addresses concept drift in malware detection by incorporating active learning techniques to adapt to evolving malware patterns. The results indicate that BERT outperforms the CNN and LSTM baselines, showing higher accuracy and adaptability, which the study attributes primarily to its advanced natural language processing (NLP) capabilities. The study contributes to research at the intersection of cybersecurity and NLP.
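The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling in which 1 denotes malware and 0 denotes a benign application. The function name `binary_metrics` and the sample labels are hypothetical, chosen only for illustration. They are not taken from the study, and they do not reproduce its results.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: label 1 means malware (the positive class), 0 means benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware flagged as malware
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed as benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as malware
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are reported alongside accuracy because malware datasets are often imbalanced, so a detector can score high accuracy while still missing many malicious samples.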