Affiliation:
1. Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Abstract
The recent rapid growth in the presence of Saudi female athletes and sports enthusiasts on social media has exposed them to gender-based hate speech and discrimination. Hate speech, a harmful worldwide phenomenon, can have severe consequences. Its prevalence in sports has surged alongside the growing influence of social media, with X serving as a prominent platform for hate speech and discriminatory comments, often targeting women in sports. This research combines two studies that explore online hate speech and gender biases in the context of sports, and it proposes an automated solution for detecting hate speech targeting women in sports on platforms like X, with a particular focus on Arabic, a challenging domain with limited prior research. In Study 1, semi-structured interviews with 33 Saudi female athletes and sports fans revealed common forms of hate speech, including gender-based derogatory comments, misogyny, and appearance-related discrimination. Building on Study 1, Study 2 addresses the pressing need for effective interventions against hate speech targeting women in sports on social media by evaluating machine learning (ML) models for identifying such hate speech in Arabic. A dataset of 7,487 Arabic tweets was collected, annotated, and pre-processed. Term frequency-inverse document frequency (TF-IDF) and part-of-speech (POS) feature extraction techniques were used, and various ML algorithms were trained. Random Forest consistently outperformed the other methods, achieving accuracies of 85% and 84% with TF-IDF and POS features, respectively, which demonstrates the effectiveness of both feature sets in identifying Arabic hate speech. The research advances the understanding of online hate targeting Arabic women in sports by identifying various forms of such hate.
The systematic creation of a meticulously annotated Arabic hate speech dataset, specifically focused on women’s sports, enhances the dataset’s reliability and provides valuable insights for future research in countering hate speech against women in sports. This dataset forms a strong foundation for developing effective strategies to address online hate within the unique context of women’s sports. The research findings contribute to the ongoing efforts to combat hate speech against women in sports on social media, aligning with the objectives of Saudi Arabia’s Vision 2030 and recognizing the significance of female participation in sports.
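To make the TF-IDF feature extraction named in the abstract concrete, the following is a minimal sketch of how TF-IDF weights can be computed for tokenized texts. It is not the authors' pipeline: the abstract does not state which TF-IDF variant, tokenizer, or library was used, so the smoothed IDF formula, the function name `tfidf`, and the toy English tokens are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not confirmed by the source):
      tf(t, d) = count of t in d / number of tokens in d
      idf(t)   = ln((1 + N) / (1 + df(t))) + 1   (smoothed IDF)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        if total == 0:
            vectors.append({})
            continue
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


# Hypothetical, made-up tokenized texts (not taken from the paper's dataset).
docs = [
    ["women", "football", "great", "match"],
    ["women", "football", "team", "lost", "match"],
    ["great", "goal"],
]
vectors = tfidf(docs)
```

In this sketch, a term that occurs in fewer documents (such as "goal") receives a higher weight than an equally frequent term that occurs in more documents (such as "great"). The resulting weight vectors are the kind of input a classifier such as Random Forest would be trained on.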
Funder
Princess Nourah bint Abdulrahman University Researchers Supporting Project
Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia