Affiliation:
1. Department of Computer Engineering, Hacettepe University, 06800 Ankara, Türkiye
2. Department of Computer Engineering, Eskisehir Osmangazi University, 26040 Eskisehir, Türkiye
Abstract
In phishing attack detection, machine learning-based approaches are more effective than simple blacklisting strategies, as they can adapt to new types of attacks and do not require manual updates. However, for these approaches, the choice of features and classifiers directly influences detection performance. Therefore, in this work, the contributions of various features and classifiers to detecting phishing attacks were thoroughly analyzed to find the best classifier and feature set in terms of different performance metrics including accuracy, precision, recall, F1-score, and classification time. For this purpose, a brand-new phishing dataset was prepared and made publicly available. Using an exhaustive strategy, every combination of the feature groups was fed into various classifiers to detect phishing websites. Two existing benchmark datasets were also used in addition to ours for further analysis. The experimental results revealed that the features based on the uniform resource locator (URL) and hypertext transfer protocol (HTTP), rather than all features, offered the best performance. Also, the decision tree classifier surpassed the others, achieving an F1-score of 0.99 and being one of the fastest classifiers overall.
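The exhaustive strategy mentioned in the abstract can be illustrated with a minimal sketch: enumerate every non-empty combination of feature groups, score each combination with each classifier, and keep the best pair by F1-score. This is not the authors' implementation. The group names `html` and `dns`, the classifier labels, and the caller-supplied `evaluate` function are assumptions made for illustration; only the URL and HTTP feature groups and the decision tree classifier are named in the abstract.

```python
from itertools import combinations

# Hypothetical feature-group names; only "url" and "http" appear in the abstract.
FEATURE_GROUPS = ["url", "http", "html", "dns"]


def all_group_combinations(groups):
    """Yield every non-empty combination of feature groups (2**n - 1 in total)."""
    for size in range(1, len(groups) + 1):
        for combo in combinations(groups, size):
            yield combo


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def exhaustive_search(groups, classifiers, evaluate):
    """Score every (feature-group combination, classifier) pair.

    `evaluate(combo, clf_name)` is a hypothetical caller-supplied function
    that trains and tests the named classifier on the given feature groups
    and returns its F1-score.
    """
    results = {}
    for combo in all_group_combinations(groups):
        for clf_name in classifiers:
            results[(combo, clf_name)] = evaluate(combo, clf_name)
    best = max(results, key=results.get)
    return best, results
```

With n feature groups the search covers 2^n - 1 combinations per classifier, so this approach is only practical when the number of groups is small.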
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science