Abstract
Purpose
In this digital era, email is the most pervasive form of communication between people. Many users fall victim to spam emails, and their data are exposed.
Design/methodology/approach
Researchers have addressed this problem by focusing on advanced machine learning algorithms and improved models for detecting spam emails, but a gap remains in the choice of features. Features also play an important role in achieving good results. The performance of the applied classifiers is evaluated with 10-fold cross-validation.
Findings
The results show that spam emails are correctly classified with an accuracy of 98.00% for the Support Vector Machine and 98.06% for the Artificial Neural Network, compared with the other applied machine learning classifiers.
Originality/value
In this paper, Point-Biserial correlation is computed between each feature and the class label of the University of California Irvine (UCI) Spambase email dataset to select the best features. Extensive experiments are conducted on the selected features by training different classifiers.
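The feature-selection and evaluation steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the correlation threshold of 0.2, and the random seed are all assumptions, since the abstract does not state how features were cut off or how folds were drawn. The point-biserial coefficient is computed from its standard definition, r_pb = (M1 - M0) / s * sqrt(n1 * n0 / n^2), where M1 and M0 are the feature means in the spam and non-spam groups and s is the population standard deviation of the feature.

```python
import numpy as np

def point_biserial(feature, label):
    """Point-biserial correlation between a continuous feature and a 0/1 label."""
    feature = np.asarray(feature, dtype=float)
    label = np.asarray(label).astype(bool)
    n1, n0 = label.sum(), (~label).sum()
    n = n1 + n0
    s = feature.std()  # population standard deviation (ddof=0)
    if s == 0 or n1 == 0 or n0 == 0:
        return 0.0  # correlation undefined; treat the feature as uninformative
    m1, m0 = feature[label].mean(), feature[~label].mean()
    return float((m1 - m0) / s * np.sqrt(n1 * n0 / n**2))

def select_features(X, y, threshold=0.2):
    """Return indices of columns whose |r_pb| meets the threshold, plus all scores.

    The threshold value is a hypothetical choice for illustration only.
    """
    X = np.asarray(X, dtype=float)
    scores = np.array([point_biserial(X[:, j], y) for j in range(X.shape[1])])
    keep = np.flatnonzero(np.abs(scores) >= threshold)
    return keep, scores

def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        train_idx = np.concatenate([folds[i] for i in range(n_folds) if i != k])
        yield train_idx, folds[k]
```

In use, `select_features` would be applied to the Spambase feature matrix and its spam/non-spam labels, and each classifier would then be trained on `X[train_idx][:, keep]` and scored on `X[test_idx][:, keep]` for every pair produced by `ten_fold_indices`, averaging accuracy over the ten folds. For a strict evaluation, the feature selection itself would be repeated inside each training fold; the abstract does not say which variant was used.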
Subject
Computer Science Applications, Information Systems, Software
Cited by
15 articles.