Affiliation:
1. University of Zagreb, Faculty of Electrical Engineering and Computing
Abstract
This paper reports empirical experiments on the accuracy of different machine learning algorithms for detecting spam messages, using a public, open-source dataset of spam messages. The originality of our study lies in combining topic modeling, specifically Latent Dirichlet Allocation (LDA), with machine learning classifiers for spam detection. By extracting hidden topics and uncovering patterns in spam and non-spam messages, we provide insights into the characteristics that distinguish spam. Machine learning is also a powerful tool for strengthening risk control measures, which supports the sustainability of digital platforms and communication channels. The key findings are that the Logistic Regression classifier achieved the highest F score of 0.986, followed by the Support Vector Machine classifier with 0.98 and the Naive Bayes classifier with 0.955. The study concludes that Logistic Regression outperforms Naive Bayes and Support Vector Machine in text classification, particularly in spam detection, and emphasizes the role of machine learning techniques in optimizing risk management strategies for sustained digital ecosystems. This advantage stems from Logistic Regression's ability to model complex relationships, which enables it to achieve high accuracy on both training and test datasets.
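As a minimal illustration of the metric reported above, the following sketch computes an F score from predicted and true labels. It assumes that the reported F score is the balanced F1 (the harmonic mean of precision and recall) with spam as the positive class, which the abstract does not state explicitly. The function name `f_score` and the toy labels are hypothetical and are not taken from the study's data.

```python
def f_score(y_true, y_pred, positive="spam", beta=1.0):
    """F-beta score for one positive class; beta=1 gives the balanced F1."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: the score is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy labels, invented for illustration only (not from the paper's dataset).
y_true = ["spam", "ham", "spam", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam"]
score = f_score(y_true, y_pred)
print(round(score, 3))
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score equals 2/3.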