An Ensemble-Based Multi-Classification Machine Learning Classifiers Approach to Detect Multiple Classes of Cyberbullying
Published:2024-01-12
Issue:1
Volume:6
Page:156-170
ISSN:2504-4990
Container-title:Machine Learning and Knowledge Extraction
language:en
Short-container-title:MAKE
Author:
Alqahtani Abdulkarim Faraj (1, 2), Ilyas Mohammad (1)
Affiliation:
1. Electrical Engineering and Computer Science, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA
2. Ministry of National Guard, King Khalid Military Academy, Riyadh 14625, Saudi Arabia
Abstract
Communication through social media has become a significant social issue because it can enable inappropriate behavior, referred to as cyberbullying. Automated systems can identify cyberbullying efficiently and perform sentiment analysis on social media platforms. This study focuses on enhancing a system that detects six types of cyberbullying tweets. We applied multi-classification algorithms to a cyberbullying dataset and achieved high accuracy, particularly with TF-IDF (bigram) feature extraction, outperforming the results reported for previous experiments on the same dataset. Three popular multi-classification algorithms (Decision Trees, Random Forest, and XGBoost) were combined into two separate ensemble methods, a stacking classifier and a voting classifier, both using N-gram features with TF-IDF. These ensemble classifiers outperformed traditional machine learning classifier models: the stacking classifier reached 90.71% accuracy and the voting classifier 90.44%. The results show that the framework can detect six different types of cyberbullying more efficiently, with a best accuracy of 0.9071 (90.71%).
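The abstract names two building blocks, bigram TF-IDF features and a voting ensemble, without showing either. The following is a minimal, standard-library-only sketch of both ideas and is not the authors' implementation. Several things here are assumptions for illustration: the function names `bigrams`, `tfidf_bigram`, and `hard_vote`, the whitespace tokenizer, the smoothed IDF formula ln((1 + n) / (1 + df)) + 1 (the default smoothing used by scikit-learn's TF-IDF transformer, applied here without normalization), the tie-breaking rule, and the placeholder class labels. The stacking classifier is not sketched.

```python
import math
from collections import Counter


def bigrams(text):
    """Split on whitespace (an assumed tokenizer) and return word bigrams."""
    tokens = text.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_bigram(docs):
    """Return one {bigram: weight} dict per document.

    weight = raw term frequency * (ln((1 + n) / (1 + df)) + 1),
    an assumed smoothing choice; no vector normalization is applied.
    """
    grams = [bigrams(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each bigram occurs.
    df = Counter(g for doc in grams for g in set(doc))
    vectors = []
    for doc in grams:
        tf = Counter(doc)
        vectors.append(
            {g: c * (math.log((1 + n) / (1 + df[g])) + 1) for g, c in tf.items()}
        )
    return vectors


def hard_vote(predictions):
    """Majority vote across models.

    `predictions` holds one list of labels per model, all of equal length.
    Ties go to the label predicted by the earliest model in the list
    (an assumed tie-breaking rule).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


# Toy data; labels are placeholders, not the dataset's six classes.
docs = ["you are so dumb", "you are so kind", "have a nice day"]
vectors = tfidf_bigram(docs)
votes = hard_vote([
    ["type_1", "type_2", "type_3"],  # e.g. a decision tree's predictions
    ["type_1", "type_3", "type_3"],  # e.g. a random forest's predictions
    ["type_4", "type_2", "type_3"],  # e.g. a boosted model's predictions
])
print(vectors[0])
print(votes)
```

In this toy run a bigram that appears in only one document ("so dumb") receives a higher weight than one shared by two documents ("you are"), which is the property that makes TF-IDF useful for separating classes. In practice one would likely use a library vectorizer and library ensemble classes rather than hand-rolled functions.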
Subject
Artificial Intelligence, Engineering (miscellaneous)