Comparing Machine Learning and Deep Learning Techniques for Text Analytics: Detecting the Severity of Hate Comments Online

Published:2023-11-24
ISSN:1387-3326
Container-title:Information Systems Frontiers
language:en
Short-container-title:Inf Syst Front

Author:

Marshan Alaa, Nizar Farah Nasreen Mohamed, Ioannou Athina, Spanaki Konstantina

Abstract

Social media platforms have become an increasingly popular tool for individuals to share their thoughts and opinions with other people. However, people often misuse social media by posting abusive comments. Abusive and harassing behaviours can have adverse effects on people's lives. This study takes a novel approach to combating harassment on online platforms by detecting the severity of abusive comments, a task that has not been investigated before. The study compares the performance of machine learning models such as Naïve Bayes, Random Forest, and Support Vector Machine with deep learning models such as Convolutional Neural Network (CNN) and Bi-directional Long Short-Term Memory (Bi-LSTM). Moreover, this work investigates the effect of text pre-processing on the performance of the machine and deep learning models. The feature set for the abusive comments was built using unigrams and bigrams for the machine learning models and word embeddings for the deep learning models. The comparison of the models’ performances showed that the Random Forest with bigrams achieved the best overall performance, with an accuracy of 0.94, a precision of 0.91, a recall of 0.94, and an F1 score of 0.92. The study develops an efficient model to detect the severity of abusive language on online platforms, offering important implications for both theory and practice.

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Information Systems, Theoretical Computer Science, Software

Link

https://link.springer.com/content/pdf/10.1007/s10796-023-10446-x.pdf

References: 90 articles.

1. Abro, S., et al. (2020). Automatic hate speech detection using machine learning: A comparative study. International Journal of Advanced Computer Science and Applications, 11(8), 484–491. https://doi.org/10.14569/IJACSA.2020.0110861

2. Al-Ajlan, M. A., & Ykhlef, M. (2018). Optimized twitter cyberbullying detection based on deep learning. In 21st Saudi Computer Society National Computer Conference, NCC 2018, pp. 1–5. https://doi.org/10.1109/NCG.2018.8593146

3. Alam, S., & Yao, N. (2019). The impact of preprocessing steps on the accuracy of machine learning algorithms in sentiment analysis. Computational and Mathematical Organization Theory, 25(3), 319–335. https://doi.org/10.1007/s10588-018-9266-8

4. Al-Garadi, M. A., Varathan, K. D., & Ravana, S. D. (2016). Cybercrime detection in online communications: The experimental case of cyberbullying detection in the Twitter network. Computers in Human Behavior, 63, 433–443. https://doi.org/10.1016/j.chb.2016.05.051

5. Au, C. H., Ho, K. K. W., & Chiu, D. K. W. (2021). The role of online misinformation and fake news in ideological polarization: barriers, catalysts, and implications. Information Systems Frontiers, 1331–1354. https://doi.org/10.1007/s10796-021-10133-9

Cited by 1 article.

1. MedT5SQL: a transformers-based large language model for text-to-SQL conversion in the healthcare domain; Frontiers in Big Data; 2024-06-26