Affiliation:
1. LRDSI Laboratory, Department of Computer Science, Faculty of Sciences, University of Blida 1, Blida, Algeria
Abstract
Phishing attacks increase every year, in both number and variety of techniques. By exploiting human weaknesses alone, an attacker can easily obtain a victim’s credentials or gain access to their network. Despite the many approaches proposed by researchers, the problem persists because of its dynamic nature: new phishing tactics emerge continually. More robust and effective methods for detecting phishing emails are therefore needed. In this paper, we detect phishing emails from the body text of the email using a hybrid approach that combines case-based reasoning (CBR) with a deep learning model. Our proposed model, called DL-CBR, consists of a Bidirectional Long Short-Term Memory (Bi-LSTM) and Temporal Convolutional Network (TCN) architecture with an attention mechanism, followed by a CBR classifier. The deep learning model, trained with the [Formula: see text]-pair loss function, produces the email representation. We evaluated DL-CBR using precision, accuracy, recall, and F-measure, and obtained an accuracy of 98.28%. The results show that our model outperforms other CBR systems that rely on classical text representations such as TF-IDF and Bag-of-Words. Although its performance is slightly below that of state-of-the-art models, it offers several advantages inherent to CBR; for instance, it can learn from new cases and update its database accordingly.
Publisher
World Scientific Pub Co Pte Ltd