Interaction between feature subset selection techniques and machine learning classifiers for detecting unsolicited emails

Published:2014-03 Issue:1 Volume:14 Page:53-61
ISSN:1559-6915
Container-title:ACM SIGAPP Applied Computing Review
language:en
Short-container-title:SIGAPP Appl. Comput. Rev.

Author:

Trivedi Shrawan Kumar¹, Dey Shubhamoy¹

Affiliation:

1. Indian Institute of Management, Prabandh Shikhar, Rau, Indore, India

Abstract

Detecting spam emails within a set of email files has become a challenging task for researchers. Identifying an effective classifier depends not only on high detection accuracy but also on low false alarm rates and on using as few features as possible. In view of these challenges, this research examines the effects of using features selected by four feature subset selection methods (Genetic, Greedy Stepwise, Best First, and Rank Search) on popular machine learning classifiers such as Bayesian, Naive Bayes, Support Vector Machine, Genetic Algorithm, J48, and Random Forest. Tests were performed on three publicly available spam email datasets: "Enron", "SpamAssassin", and "LingSpam". The results show that Greedy Stepwise Search is a good method for feature subset selection in spam email detection. Among the machine learning classifiers, Support Vector Machine was found to be the best in terms of both accuracy and false positive rate. However, the results of Random Forest were very close to those of Support Vector Machine. The Genetic classifier was identified as a weak classifier.
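As a rough illustration of the ideas named in the abstract, the sketch below pairs a greedy stepwise (forward) feature search with the two evaluation measures mentioned, accuracy and false positive rate. It is not the authors' implementation: the toy data `X` and `y`, the "flag as spam if any selected feature is present" rule, and the `accuracy - FPR` score are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Set, Tuple


def accuracy_and_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, false positive rate); label 1 = spam, 0 = legitimate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


def greedy_forward_selection(n_features: int, score: Callable[[Set[int]], float]) -> List[int]:
    """Add the best single feature each round; stop when no addition improves the score."""
    selected: List[int] = []
    best = float("-inf")
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(score(set(selected) | {f}), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda s: s[0])
        if top_score <= best:
            break  # no candidate improves on the current subset
        selected.append(top_feature)
        best = top_score
    return selected


# Hypothetical toy data: each row is an email, each column a binary word-presence feature.
X = [[1, 0, 1], [1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1], [1, 0, 0]]
y = [1, 1, 0, 0, 0, 1]


def score(subset: Set[int]) -> float:
    """Assumed scoring rule: flag as spam if any selected feature is present."""
    preds = [1 if any(row[f] for f in subset) else 0 for row in X]
    accuracy, fpr = accuracy_and_fpr(y, preds)
    return accuracy - fpr  # reward accuracy, penalize false alarms


selected_features = greedy_forward_selection(3, score)
```

In practice the score would come from cross-validating a real classifier on the candidate subset; the stopping rule here (stop at the first round with no improvement) is one common variant of stepwise search.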

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/2600617.2600622

References: 22 articles.

1. Whittaker S., Bellotti V., & Moody P. (2005). Introduction to this special issue on revisiting and reinventing email. Human-Computer Interaction, 20(1-2), 1-9. 10.1207/s15327051hci2001&2_1

2. An empirical study of three machine learning methods for spam filtering

3. Spam and the ongoing battle for the inbox

Cited by 18 articles.

1. What prompts consumers to purchase online? A machine learning approach;Electronic Commerce Research;2022-11-09

2. Machine-learning approach to analyze on-the-job data via cloud services;Materials Today: Proceedings;2021-02

3. Resilience of mobile ad hoc networks to security attacks and optimization of routing process;Materials Today: Proceedings;2020-11

4. Ensemble Decision for Spam Detection Using Term Space Partition Approach;IEEE Transactions on Cybernetics;2020-01

5. Detecting False Messages in the Smartphone Fault Reporting System;Advances in Intelligent Systems and Computing;2019-11-02