CCrFS: Combine Correlation Features Selection for Detecting Phishing Websites Using Machine Learning

Author:

Moedjahedy JimmyORCID,Setyanto AriefORCID,Alarfaj Fawaz KhaledORCID,Alreshoodi MohammedORCID

Abstract

Internet users are continually exposed to phishing as cybercrime in the 21st century. The objective of phishing is to obtain sensitive information by deceiving a target and using the information for financial gain. The information may include a login detail, password, date of birth, credit card number, bank account number, and family-related information. To acquire these details, users will be directed to fill out the information on false websites based on information from emails, adverts, text messages, or website pop-ups. Examining the website’s URL address is one method for avoiding this type of deception. Identifying the features of a phishing website URL takes specialized knowledge and investigation. Machine learning is one method that uses existing data to teach machines to distinguish between legal and phishing website URLs. In this work, we proposed a method that combines correlation and recursive feature elimination to determine which URL characteristics are useful for identifying phishing websites by gradually decreasing the number of features while maintaining accuracy value. In this paper, we use two datasets that contain 48 and 87 features. The first scenario combines power predictive score correlation and recursive feature elimination; the second scenario is the maximal information coefficient correlation and recursive feature elimination. The third scenario combines spearman correlation and recursive feature elimination. All three scenarios from the combined findings of the proposed methodologies achieve a high level of accuracy even with the smallest feature subset. For dataset 1, the accuracy value for the 10 features result is 97.06%, and for dataset 2 the accuracy value is 95.88% for 10 features.

Publisher

MDPI AG

Subject

Computer Networks and Communications

Cited by 13 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. An Explainable Feature Selection Framework for Web Phishing Detection with Machine Learning;Data Science and Management;2024-08

2. Phishing URL Detection Using Machine Learning and Deep Learning;2024 IEEE World AI IoT Congress (AIIoT);2024-05-29

3. An Improved Webpage Phishing Detection Based on an Ensemble Feature Selection and Multilevel Techniques;2024 International Conference on Science, Engineering and Business for Driving Sustainable Development Goals (SEB4SDG);2024-04-02

4. A Comprehensive Survey: Exploring Current Trends and Challenges in Intrusion Detection and Prevention Systems in the Cloud Computing Paradigm;2024 2nd International Conference on Intelligent Data Communication Technologies and Internet of Things (IDCIoT);2024-01-04

5. Detection of Phishing Domain Using Logistic Regression Technique and Feature Extraction Using BERT Classification Model;2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON);2023-12-29

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3