Detection of phishing websites using a novel twofold ensemble model

Published: 2018-10-18 Issue: 3 Volume: 20 Page: 321-357
ISSN: 1328-7265
Container-title: Journal of Systems and Information Technology
Language: en
Short-container-title: JSIT

Authors:

Nagaraj Kalyan, Bhattacharjee Biplab, Sridhar Amulyashree, GS Sharvani

Abstract

Purpose: Phishing is one of the major threats affecting businesses worldwide. Organizations and customers are exposed to phishing attacks because of anonymous access to vulnerable details, and such attacks often result in substantial financial losses. Thus, there is a need for effective intrusion detection techniques to identify and possibly nullify the effects of phishing. Classifying phishing and non-phishing web content is a critical task in information security protocols, and foolproof mechanisms have yet to be implemented in practice. The purpose of the current study is to present an ensemble machine learning model for classifying phishing websites.

Design/methodology/approach: A publicly available data set comprising 10,068 instances of phishing and legitimate websites was used to build the classifier model. Feature extraction was performed using a group of methods, and the relevant extracted features were used to build the model. A twofold ensemble learner was developed by feeding the results of a random forest (RF) classifier into a feedforward neural network (NN). The performance of the ensemble classifier was validated using k-fold cross-validation. The twofold ensemble learner was implemented as a user-friendly, interactive decision support system for classifying websites as phishing or legitimate.

Findings: Experimental simulations were performed to assess and compare the performance of the ensemble classifiers. Statistical tests indicated that the RF_NN model gave superior performance, with an accuracy of 93.41 per cent and a minimal mean squared error of 0.000026.

Research limitations/implications: The research data set used in this study is publicly available and easy to analyze. Comparative analysis with other, more recent real-time data sets must be performed to ensure that the model generalizes across various security breaches. Different variants of phishing threats must also be detected, rather than focusing only on phishing website detection.

Originality/value: To the authors' knowledge, the twofold ensemble model has not been applied to the classification of phishing websites in any previous study.
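The abstract mentions k-fold cross-validation and reports two metrics, accuracy and mean squared error. As a rough illustration of how those pieces fit together, the following is a minimal standard-library sketch, not the authors' implementation. All function names are hypothetical, the toy data is invented for the example, and the model is a stand-in baseline that predicts the training-set phishing rate for every site (it is not the RF_NN ensemble described above).

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_score):
    """Average squared difference between true labels (0/1) and predicted scores."""
    return sum((t - s) ** 2 for t, s in zip(y_true, y_score)) / len(y_true)

def cross_validate(X, y, fit, predict_score, k=5, seed=0):
    """Return a list of (accuracy, mse) pairs, one per held-out fold."""
    results = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict_score(model, X[i]) for i in held_out]
        labels = [1 if s >= 0.5 else 0 for s in scores]  # threshold scores at 0.5
        truth = [y[i] for i in held_out]
        results.append((accuracy(truth, labels), mean_squared_error(truth, scores)))
    return results

# Hypothetical stand-in model (assumption, not the paper's RF_NN):
# it ignores the features and predicts the training-set phishing rate.
def fit_baseline(X_train, y_train):
    return sum(y_train) / len(y_train)

def predict_baseline(model, x):
    return model

# Invented toy data: 20 "websites", label 1 = phishing, 0 = legitimate.
X = [[i] for i in range(20)]
y = [1 if i % 4 == 0 else 0 for i in range(20)]
results = cross_validate(X, y, fit_baseline, predict_baseline, k=5)
for fold, (acc, mse) in enumerate(results, start=1):
    print(f"fold {fold}: accuracy={acc:.2f} mse={mse:.4f}")
```

Passing the model as a pair of `fit` and `predict_score` callables keeps the validation loop independent of the classifier, so a two-stage learner could be substituted for the baseline without changing the fold logic.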

Publisher

Emerald

Subject

General Computer Science,Information Systems

References: 185 articles.

1. Multi-label rules for phishing classification;Applied Computing and Informatics,2015

2. Phishing detection based associative classification data mining;Expert Systems with Applications,2014

3. Phishing detection: a recent intelligent machine learning comparison based on models content and features,2017

4. A comparison of machine learning techniques for phishing detection,2007

5. Distributed phishing detection by applying variable selection using Bayesian additive regression trees,2009

Cited by 15 articles.

1. Ensemble Learning Approach for Phishing Website Detection Using an Optimal Greedy Stacking Model;Journal of The Institution of Engineers (India): Series B;2024-09-12

2. Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case;Journal of Systems and Information Technology;2024-09-03

3. Unveiling suspicious phishing attacks: enhancing detection with an optimal feature vectorization algorithm and supervised machine learning;Frontiers in Computer Science;2024-07-02

4. Dataset of suspicious phishing URL detection;Frontiers in Computer Science;2024-03-06

5. Prediction of Phishing Sites in Network using Naive Bayes compared over Random Forest with improved Accuracy;2023 Eighth International Conference on Science Technology Engineering and Mathematics (ICONSTEM);2023-04-06