Intelligent Deep Machine Learning Cyber Phishing URL Detection Based on BERT Features Extraction

Published:2022-11-08 Issue:22 Volume:11 Page:3647
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Elsadig Muna, Ibrahim Ashraf Osman, Basheer Shakila, Alohali Manal Abdullah, Alshunaifi Sara, Alqahtani Haya, Alharbi Nihal, Nagmeldin Wamda

Abstract

Phishing attacks have recently become a critical threat to cyberspace security. Phishing is a form of fraud that lures people and businesses into visiting malicious uniform resource locators (URLs) and submitting sensitive information such as passwords, credit card IDs, and personal information. Large numbers of sophisticated attacks are launched dynamically to trick users into believing they are accessing a reliable website or online application, with the goal of acquiring account information. As phishing grows more sophisticated and malicious every day, researchers are motivated to create intelligent models and offer secure services on the web. This paper introduces a novel phishing URL detection technique based on BERT feature extraction and a deep learning method. BERT was used to extract the URLs' text from the Phishing Site Predict dataset. A natural language processing (NLP) algorithm was then applied to the unique data column and extracted a large number of useful features in the form of meaningful text information. Next, a deep convolutional neural network was used to detect phishing URLs; it composes words or n-grams in order to extract higher-level features. The data were then classified into legitimate and phishing URLs. The proposed method was evaluated on a well-known public dataset of phishing website URLs containing 549,346 entries. Three scenarios were developed to compare the outcomes of the proposed method on similar datasets. The feature extraction process relies on natural language processing techniques. In the experiments, the proposed method achieved 96.66% accuracy, and the results were compared with those reported in related work. The results showed that the proposed method was efficient and valid for detecting phishing website URLs.
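
The abstract describes a convolutional network that composes words or n-grams into higher-level features. The following is a minimal, standard-library-only sketch of that one idea: a width-n filter slid over a sequence of token vectors acts as an n-gram detector, and max-over-time pooling keeps its strongest response. It is not the authors' model. The names `tokenize_url`, `toy_embedding`, and `conv1d_max_pool` are hypothetical, the regex tokenizer stands in for BERT's tokenizer, and the toy vectors stand in for BERT features.

```python
import re
from typing import List, Tuple


def tokenize_url(url: str) -> List[str]:
    """Split a URL into lowercase alphanumeric tokens.

    Assumption: a simple regex split stands in for BERT's tokenizer.
    """
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-token windows of the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def toy_embedding(token: str, dim: int = 4) -> List[float]:
    """Deterministic toy vector in [0, 1]; a real pipeline would use BERT output."""
    total = sum(ord(c) for c in token)
    return [((total * (k + 1)) % 17) / 16.0 for k in range(dim)]


def conv1d_max_pool(embeddings: List[List[float]],
                    kernel: List[List[float]]) -> float:
    """Slide one filter of width len(kernel) over the sequence.

    Each window covers one n-gram; the responses pass through ReLU and
    max-over-time pooling reduces them to a single feature value.
    """
    n = len(kernel)
    best = 0.0
    for i in range(len(embeddings) - n + 1):
        response = sum(
            kernel[j][k] * embeddings[i + j][k]
            for j in range(n)
            for k in range(len(kernel[j]))
        )
        best = max(best, max(0.0, response))
    return best


if __name__ == "__main__":
    # Hypothetical example URL, not taken from the dataset.
    tokens = tokenize_url("http://paypal-login.example.com/verify?id=1")
    vectors = [toy_embedding(t) for t in tokens]
    bigram_filter = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    print(ngrams(tokens, 2))
    print(conv1d_max_pool(vectors, bigram_filter))
```

In a trained network the filter weights are learned and many filters of several widths run in parallel; the pooled values then feed a classifier that separates legitimate from phishing URLs.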

Funder

Princess Nourah bint Abdulrahman University

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/11/22/3647/pdf

Reference: 43 articles.

1. Fighting against phishing attacks: State of the art and future challenges;Gupta;Neural Comput. Appl.,2017

2. Impact of COVID-19 on consumer buying behavior toward online shopping in Iraq;Ali;Econ. Stud. J.,2020

3. Huang, Y., Qin, J., and Wen, W. Phishing URL detection via capsule-based neural network. Proceedings of the 2019 IEEE 13th International Conference on Anti-Counterfeiting, Security, and Identification (ASID).

4. Social engineering attacks during the COVID-19 pandemic;Venkatesha;SN Comput. Sci.,2021

5. Available online: https://www.statista.com/statistics/420442/organizations-most-affected-byphishing/. 2022.

Cited by 14 articles.

1. A Filter-Based Feature Selection for Robust Phishing Attack Detection using XGBoost;International Journal of Advanced Research in Science, Communication and Technology;2024-08-17

2. Adaptive weighted feature fusion for multiscale atrous convolution‐based 1DCNN with dilated LSTM‐aided fake news detection using regional language text information;Expert Systems;2024-07-04

3. A Review of Advancements and Applications of Pre-Trained Language Models in Cybersecurity;2024 12th International Symposium on Digital Forensics and Security (ISDFS);2024-04-29

4. A deep learning mechanism to detect phishing URLs using the permutation importance method and SMOTE-Tomek link;The Journal of Supercomputing;2024-04-23

5. The applicability of a hybrid framework for automated phishing detection;Computers & Security;2024-04