Phishing web site detection using diverse machine learning algorithms

Published:2020-01-10 Issue:1 Volume:38 Page:65-80
ISSN:0264-0473
Container-title:The Electronic Library
language:en
Short-container-title:EL

Author:

Zamir Ammara, Khan Hikmat Ullah, Iqbal Tassawar, Yousaf Nazish, Aslam Farah, Anjum Almas, Hamdani Maryam

Abstract

Purpose: This paper presents a framework for detecting phishing websites using a stacking model. Phishing is a type of fraud aimed at obtaining users’ credentials. The attackers use victims’ personal and sensitive information for monetary gain. Phishing affects diverse fields, such as e-commerce, online business, banking and digital marketing, and is ordinarily carried out by sending spam emails and building websites that closely resemble the original ones. When people visit the fake website, the phishers capture their personal information.

Design/methodology/approach: Features of a phishing data set are analysed using feature selection techniques including information gain, gain ratio, Relief-F and recursive feature elimination (RFE). Two new features are proposed by combining the strongest and weakest attributes. Principal component analysis is then applied to the proposed and remaining features together with diverse machine learning algorithms: random forest (RF), neural network (NN), bagging, support vector machine, Naïve Bayes and k-nearest neighbour (kNN). Afterwards, two stacking models, Stacking1 (RF + NN + Bagging) and Stacking2 (kNN + RF + Bagging), are built by combining the highest-scoring classifiers to improve classification accuracy.

Findings: The proposed features played an important role in improving the accuracy of all the classifiers. The results show that RFE is effective at removing the least important feature from the data set. Furthermore, Stacking1 (RF + NN + Bagging) outperformed all other classifiers, detecting phishing websites with 97.4% accuracy.

Originality/value: This research is novel in that no previous research has focused on using a feed-forward NN together with ensemble learners for detecting phishing websites.
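The pipeline described in the abstract (RFE, then principal component analysis, then a stack of RF + NN + Bagging) can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the authors' implementation: the data is synthetic (`make_classification` stands in for the phishing data set), and the logistic-regression meta-learner, the number of PCA components, the network size and all other hyperparameters are assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import (
    BaggingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a phishing data set (assumption: 30 numeric features,
# binary label). Replace with real features and labels.
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# "Stacking1"-style ensemble: RF + feed-forward NN + Bagging as base learners.
# The meta-learner (logistic regression) is an assumption.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nn", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                             random_state=0)),
        ("bag", BaggingClassifier(n_estimators=10, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # base-learner predictions for the meta-learner come from 5-fold CV
)

model = make_pipeline(
    StandardScaler(),  # scaling helps both PCA and the neural network
    # RFE configured to drop only the single least important feature.
    RFE(RandomForestClassifier(n_estimators=50, random_state=0),
        n_features_to_select=X.shape[1] - 1),
    PCA(n_components=15),  # component count chosen arbitrarily for the sketch
    stack,
)

model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Accuracy on this synthetic data says nothing about the 97.4% reported in the abstract; the sketch only shows how the stages fit together. Swapping the `"nn"` entry for a `KNeighborsClassifier` would give an analogue of the Stacking2 configuration.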

Publisher

Emerald

Subject

Library and Information Sciences, Computer Science Applications

Cited by 91 articles.

1. Advanced Detection of Abnormal ECG Patterns Using an Optimized LADTree Model with Enhanced Predictive Feature: Potential Application in CKD;Algorithms;2024-09-11

2. XAI-PhD: Fortifying Trust of Phishing URL Detection Empowered by Shapley Additive Explanations;International Journal of Online and Biomedical Engineering (iJOE);2024-08-08

3. Unveiling suspicious phishing attacks: enhancing detection with an optimal feature vectorization algorithm and supervised machine learning;Frontiers in Computer Science;2024-07-02

4. Utilizing Large Language Models with Human Feedback Integration for Generating Dedicated Warning for Phishing Emails;Proceedings of the 2nd ACM Workshop on Secure and Trustworthy Deep Learning Systems;2024-07-02

5. Beneath the Phishing Scripts: A Script-Level Analysis of Phishing Kits and Their Impact on Real-World Phishing Websites;Proceedings of the 19th ACM Asia Conference on Computer and Communications Security;2024-07