A Method for Fast Selection of Machine-Learning Classifiers for Spam Filtering

Published:2021-08-27 Issue:17 Volume:10 Page:2083
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Rapacz Sylwia, Chołda Piotr, Natkaniec Marek

Abstract

The paper examines how text analysis influences classification, a key part of the spam-filtering process. The authors propose a multistage meta-algorithm for checking classifier performance. The algorithm allows for the fast selection of the best-performing classifiers as well as for the analysis of higher-dimensionality data, which is especially important when analyzing large datasets. The meta-algorithm applies cross-validation between different datasets for supervised learning. To illustrate its operation, the paper compares three machine-learning methods that allow a user to classify e-mails as desirable (ham) or potentially harmful (spam) messages: k-nearest neighbours (k-NN), support vector machines (SVM), and the naïve Bayes classifier (NB). The methods used are simple but, as the results show, powerful enough. The research leads to the conclusion that the multinomial naïve Bayes classifier can be an excellent weapon in the fight against the constantly increasing amount of spam messages. It was also confirmed that the proposed solution gives very accurate results.
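
As a rough illustration of two ideas named in the abstract, multinomial naïve Bayes and validation across different datasets, the following minimal sketch trains on one labeled dataset and scores on another. It is not the authors' meta-algorithm: the function names (`train_multinomial_nb`, `predict`, `cross_dataset_scores`), the whitespace tokenizer, the smoothing value, and the tiny example messages are all assumptions made up for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model with additive smoothing `alpha`."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, text):
    """Return the label with the highest log-posterior for `text`."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen during training
            score += math.log((counts[tok] + alpha)
                              / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, messages, labels):
    """Fraction of messages whose predicted label matches the true label."""
    hits = sum(predict(model, m) == y for m, y in zip(messages, labels))
    return hits / len(labels)


def cross_dataset_scores(datasets):
    """Train on each dataset and test on every other one."""
    scores = {}
    for train_name, (train_x, train_y) in datasets.items():
        model = train_multinomial_nb(train_x, train_y)
        for test_name, (test_x, test_y) in datasets.items():
            if test_name != train_name:
                scores[(train_name, test_name)] = accuracy(model, test_x, test_y)
    return scores


# Hypothetical toy corpora, invented purely for illustration.
datasets = {
    "a": (["win free money now", "free prize claim now",
           "meeting at noon tomorrow", "project meeting notes attached"],
          ["spam", "spam", "ham", "ham"]),
    "b": (["claim your free prize", "lunch meeting tomorrow"],
          ["spam", "ham"]),
}

for (train_name, test_name), acc in cross_dataset_scores(datasets).items():
    print(f"train={train_name} test={test_name} accuracy={acc:.2f}")
```

Scoring on a dataset other than the training one gives a stricter estimate of how a classifier generalizes than splitting a single corpus. A fuller comparison along the lines of the abstract would run k-NN and SVM through the same loop.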

Funder

Narodowe Centrum Badań i Rozwoju

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/10/17/2083/pdf

References: 47 articles.

1. 15 Outrageous Email Spam Statistics that Still Ring True in 2018. https://www.propellercrm.com/blog/email-spam-statistics

2. Internet Security Threat Report. https://www.symantec.com/content/dam/symantec/docs/reports/istr-24-2019-en.pdf

3. The history of digital spam

4. Machine learning for email spam filtering: review, approaches and open research problems

5. Machine Learning Methods for Spam E-Mail Classification

Cited by 14 articles.

1. Next-Generation Spam Filtering: Comparative Fine-Tuning of LLMs, NLPs, and CNN Models for Email Spam Classification;Electronics;2024-05-23

2. Email Guard: Enhancing Security Through Spam Detection;Algorithms for Intelligent Systems;2024

3. Automated Spam Detection Using Sandpiper Optimization Algorithm-Based Feature Selection with the Machine Learning Model;IETE Journal of Research;2023-11-23

4. Enhancing Spam Email Classification using Multilayer Perceptron: Performance Analysis and Comparative Evaluation;2023 5th International Conference on Pattern Analysis and Intelligent Systems (PAIS);2023-10-25

5. Wireless Local Area Networks Threat Detection Using 1D-CNN;Sensors;2023-06-12