Efficient E-Mail Spam Detection Strategy Using Genetic Decision Tree Processing with NLP Features

Author:

Ismail Safaa S. I. (1), Mansour Romany F. (1), Abd El-Aziz Rasha M. (2), Taloba Ahmed I. (3)

Affiliation:

1. Department of Mathematics, Faculty of Science, New Valley University, El-Kharga 72511, Egypt

2. Computer Science Department, Faculty of Computers and Information, Assiut University, Assiut, Egypt

3. Information System Department, Faculty of Computers and Information, Assiut University, Assiut, Egypt

Abstract

Communication today is largely computerized, and smart devices let people reach one another across the globe without range limitations. Much of this communication runs through smartphone applications and calls, but e-mail remains the main professional channel: businesses and commercial and noncommercial organizations use it to communicate with one another and to share important official documents and reports worldwide. This global reach attracts attackers and intruders, who craft false messages with attractive content and send them as e-mails to users around the world. Such unsolicited advertising or threatening messages are known as spam mails, also called junk mails; they typically contain advertisements, promotions for a company or institution, and similar content. E-mail is normally used to deliver messages for business and official purposes, but in some cases voice instructions or messages must be sent to the destination over the same e-mail pathway. Such voice-oriented e-mails are called voice mails. A voice mail is composed to deliver spoken instructions or information so that the receiver can carry out particular tasks or receive an important message, and a voice-mail-enabled system lets the sender communicate with the recipient through speech input.
These mails are usually generated on personal computers or laptops and exchanged over the general e-mail pathway or through separate paid and nonpaid mail gateways. Text e-mail spam has been studied in much past research, which has produced several solutions, but no comparable options exist for managing security in voice-based e-mail. This paper presents a hybrid data processing mechanism that handles both text-enabled and voice-enabled e-mails, called Genetic Decision Tree Processing with Natural Language Processing (GDTPNLP). The proposed approach identifies spam in both textual and speech-enabled e-mails. GDTPNLP provides a higher spam detection rate in terms of text extraction speed, performance, cost efficiency, and accuracy. These results are explained in detail, with graphical output, in the Results and Discussion section.
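The abstract names the ingredients of GDTPNLP (NLP features, a decision tree, and a genetic search) without giving the algorithm itself. The following is only a minimal sketch of how such ingredients could fit together, not the authors' method. It assumes that voice mails have already been transcribed to text, that the NLP features are plain bag-of-words token sets, and that the genetic algorithm tunes two tree hyperparameters (`max_depth`, `min_samples`). The toy corpus and all function names are hypothetical.

```python
import random

# Hypothetical toy corpus: (message text, label), 1 = spam, 0 = ham.
# Voice mails are assumed to have been transcribed to text beforehand.
TRAIN_TEXT = [
    ("win a free prize now", 1), ("free offer click now", 1),
    ("claim your cash reward today", 1), ("cheap loans win cash", 1),
    ("meeting agenda for monday", 0), ("please review the attached report", 0),
    ("lunch on monday", 0), ("project report due friday", 0),
]
VALID_TEXT = [
    ("win cash now", 1), ("free prize offer", 1),
    ("monday project meeting", 0), ("report attached for review", 0),
]

def featurize(corpus):
    """Bag-of-words features: each message becomes a set of lowercase tokens."""
    return [(frozenset(text.lower().split()), label) for text, label in corpus]

TRAIN, VALID = featurize(TRAIN_TEXT), featurize(VALID_TEXT)

def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, depth, max_depth, min_samples):
    """Grow a binary tree that splits on word presence; leaves are 0 or 1."""
    labels = [y for _, y in rows]
    majority = int(sum(labels) * 2 >= len(labels))
    if depth >= max_depth or len(rows) < min_samples or gini(labels) == 0.0:
        return majority
    best = None
    for word in sorted(set().union(*(x for x, _ in rows))):
        yes = [r for r in rows if word in r[0]]
        no = [r for r in rows if word not in r[0]]
        if not yes or not no:
            continue  # this word does not separate anything at this node
        score = (len(yes) * gini([y for _, y in yes])
                 + len(no) * gini([y for _, y in no])) / len(rows)
        if best is None or score < best[0]:
            best = (score, word, yes, no)
    if best is None:
        return majority
    _, word, yes, no = best
    return (word,
            build_tree(yes, depth + 1, max_depth, min_samples),
            build_tree(no, depth + 1, max_depth, min_samples))

def predict(tree, tokens):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        word, yes, no = tree
        tree = yes if word in tokens else no
    return tree

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def fitness(genome, train, valid):
    """Fitness of a genome (max_depth, min_samples) = validation accuracy."""
    return accuracy(build_tree(train, 0, genome[0], genome[1]), valid)

def evolve(train, valid, generations=10, pop_size=6, seed=0):
    """Tiny genetic search: truncation selection, crossover, mutation."""
    rng = random.Random(seed)
    population = [(rng.randint(1, 5), rng.randint(1, 4)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(g, train, valid),
                        reverse=True)
        parents = ranked[: pop_size // 2]  # keep the fitter half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])  # crossover: depth from a, min_samples from b
            if rng.random() < 0.3:  # mutation: nudge max_depth by one
                child = (max(1, min(5, child[0] + rng.choice([-1, 1]))), child[1])
            children.append(child)
        population = parents + children
    best = max(population, key=lambda g: fitness(g, train, valid))
    return best, fitness(best, train, valid)
```

Calling `evolve(TRAIN, VALID)` returns the best `(max_depth, min_samples)` pair found together with its validation accuracy. Using held-out accuracy as the fitness function is a design choice of this sketch; it keeps the genetic search from simply rewarding deeper trees that memorize the training messages.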

Funder

Academy of Scientific Research and Technology

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science

Cited by 21 articles.

1. Using Machine Learning and Natural Language Processing for Unveiling Similarities between Microbial Data;Mathematics;2024-08-30

2. Privacy-aware quantum convolutional neural network for blockchain-based IoT health care data;Intelligent Decision Technologies;2024-06-07

3. Feature Selection and Classification of Email Spam Using Orthogonal Linear Jellyfish Swarm Optimizer;2024 Third International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE);2024-04-26

4. A hybrid correlation-based deep learning model for email spam classification using fuzzy inference system;Decision Analytics Journal;2024-03

5. Classification of Spam and Ham Emails with Machine Learning Techniques for Cyber Security;2023 International Conference on Integrated Intelligence and Communication Systems (ICIICS);2023-11-24
