Affiliation:
1. Cybersecurity Laboratory, BV TECH S.p.A., 20123 Milan, Italy
Abstract
Web phishing is a form of cybercrime that tricks people into visiting malicious URLs in order to exfiltrate sensitive data. Since the structure of malicious URLs evolves over time, phishing detection mechanisms that can adapt to such variations are paramount. Furthermore, web phishing detection is an unbalanced classification task, as legitimate URLs outnumber malicious ones in real-life cases. Deep learning (DL) has emerged as a promising technique for mitigating the effects of concept drift and thereby enhancing web phishing detection. Deep reinforcement learning (DRL) combines DL with reinforcement learning (RL), a sequential decision-making paradigm in which the problem to be addressed is expressed as a Markov decision process (MDP). Recent studies have proposed an ad hoc MDP formulation for unbalanced classification tasks, called the imbalanced classification Markov decision process (ICMDP). In this paper, we exploit the ICMDP to present a double deep Q-Network (DDQN)-based classifier that addresses the unbalanced web phishing classification problem. The proposed algorithm is evaluated on a Mendeley web phishing dataset, from which three different data imbalance scenarios are generated. Despite requiring a significant training time, it achieves a better geometric mean, index of balanced accuracy, F1 score, and area under the ROC curve than other DL-based classifiers combined with data-level sampling techniques in all test cases.
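To illustrate why the abstract reports the geometric mean, index of balanced accuracy (IBA), and F1 score rather than plain accuracy, the sketch below computes these metrics from their standard definitions for a small, made-up set of predictions. It is not taken from the paper: the function names, the toy labels, the choice of 1 as the phishing (minority) class, and the weighting factor alpha = 0.1 are all assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp), treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp


def imbalance_metrics(y_true, y_pred, alpha=0.1):
    """Compute accuracy, geometric mean, IBA, and F1 for binary labels.

    alpha is the IBA dominance weight; 0.1 is an assumed value here.
    """
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity / recall
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
    f1_denominator = 2 * tp + fp + fn
    return {
        "accuracy": (tp + tn) / len(y_true),
        "g_mean": math.sqrt(tpr * tnr),
        # IBA weights the squared geometric mean by the dominance (tpr - tnr).
        "iba": (1 + alpha * (tpr - tnr)) * tpr * tnr,
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
    }


# Toy example (hypothetical data): 2 phishing URLs among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
```

In this toy case the accuracy is 0.8 even though only half of the phishing URLs are detected, whereas the geometric mean, IBA, and F1 score are all noticeably lower, which is the behavior that makes them better suited to unbalanced data.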
Subject
Computer Networks and Communications, Human-Computer Interaction
Cited by
9 articles.