An efficient approach for data-imbalanced hate speech detection in Arabic social media

Published: 2023-10-04 Issue: 4 Volume: 45 Page: 6381-6390
ISSN: 1064-1246
Container-title: Journal of Intelligent & Fuzzy Systems
Short-container-title: IFS

Author:

Mohamed Mohamed S.¹, Elzayady Hossam¹, Badran Khaled M.¹, Salama Gouda I.¹

Affiliation:

1. Department of Computer Engineering and Artificial Intelligence, Military Technical College, Egypt

Abstract

The use of hateful language in public debates and forums is becoming more common. However, this might result in antagonism and conflicts among individuals, which is undesirable in an online environment. Countries, businesses, and educational institutions are exerting their greatest efforts to develop effective solutions to manage this issue. In addition, recognizing such content is difficult, particularly in Arabic, due to a variety of challenges and constraints. Long-tailed data distribution is often one of the most significant issues in actual Arabic hate speech datasets. Pre-trained models, such as bidirectional encoder representations from transformers (BERT) and generative pre-trained transformers (GPT), have become more popular in numerous natural language processing (NLP) applications in recent years. We conduct extensive experiments to address data imbalance issues by utilizing oversampling methods and a focal loss function in addition to traditional loss functions. Quasi-recurrent neural networks (QRNN) are employed to fine-tune the cutting-edge transformer-based models, MARBERTv2, MARBERTv1, and ARBERT. In this context, we suggest a new approach using ensemble learning that incorporates best-performing models for both original and oversampled datasets. Experiments proved that our proposed approach achieves superior performance compared to the most advanced methods described in the literature.
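
The focal loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulation or hyperparameters the authors used, so the version below is the standard binary focal loss, and the function names and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration only. It is not the authors' implementation.

```python
import math


def binary_focal_loss(p_positive, label, gamma=2.0, alpha=0.25):
    """Focal loss for one binary example.

    p_positive: predicted probability of the positive (e.g. hateful) class.
    label: 1 for positive, 0 for negative.
    gamma, alpha: assumed defaults, not values reported in the abstract.
    """
    eps = 1e-12  # clamp the probability to avoid log(0)
    p = min(max(p_positive, eps), 1.0 - eps)
    p_t = p if label == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if label == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0.0` and `alpha=0.5` the function reduces to half the ordinary cross-entropy. Larger `gamma` values shift the training signal toward hard, misclassified examples, which is why the loss is commonly paired with long-tailed data.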

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Reference 24 articles.

1. Enhancing Detection of Arabic Social Spam Using Data Augmentation and Machine Learning; Alkadri; Applied Sciences, 2022

2. Arabicdialects: An efficient framework for Arabic dialects opinion mining on twitter using optimized deep neural networks; Abdelminaam; IEEE Access, 2021

3. Detecting Hateful and Offensive Speech in Arabic Social Media Using Transfer Learning; Boulouard; Applied Sciences, 2022

4. Social network security: Issues, challenges, threats, and solutions; Rathore; Information Sciences, 2017

5. Intelligent detection of hate speech in Arabic social network: A machine learning approach; Aljarah; Journal of Information Science, 2021

Cited by 2 articles.

1. A comprehensive review on Arabic offensive language and hate speech detection on social media: methods, challenges and solutions;Social Network Analysis and Mining;2024-05-30

2. SMOTE for enhancing Tunisian Hate Speech detection on social media with machine learning;International Journal of Hybrid Intelligent Systems;2024-05-08