Addressing cyberbullying in Urdu tweets: a comprehensive dataset and detection system

Published: 2024-04-29 Volume: 10 Page: e1963
ISSN:2376-5992
Container-title:PeerJ Computer Science
language:en

Author:

Adeeba Farah¹, Yousuf Muhammad Irfan¹, Anwer Izza², Tariq Sardar Umair¹, Ashfaq Abdullah¹, Naqeeb Malik¹

Affiliation:

1. Department of Computer Science, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan

2. Department of Transportation Engineering and Management, University of Engineering and Technology Lahore, Lahore, Punjab, Pakistan

Abstract

The prevalence of cyberbullying has reached an alarming rate: approximately 54% of teenagers experience some form of it, including offensive hate speech, threats, and racism. This research introduces a comprehensive dataset and system for cyberbullying detection in Urdu tweets, drawing on a range of machine learning approaches from traditional models to deep learning techniques. The study has three objectives. First, a dataset of 12,500 annotated Urdu tweets is created and made publicly available to the research community. Second, annotation guidelines for Urdu text, with appropriate labels for cyberbullying detection, are developed. Finally, a series of experiments assesses the performance of machine learning and deep learning techniques in detecting cyberbullying. The results indicate that fastText deep learning models outperform the other models evaluated. The study demonstrates that the proposed system is effective at detecting and classifying cyberbullying incidents in Urdu tweets, contributing to the broader effort to create a safer digital environment.

Publisher

PeerJ

Link

https://peerj.com/articles/cs-1963.pdf

References: 20 articles.

1. Cyberbullying on social media platforms among university students in the United Arab Emirates;Abaido;International Journal of Adolescence and Youth,2020

2. Cyberbullying Corpus;Adeeba,2024

3. Automatic abusive language detection in Urdu tweets;Amjad;Acta Polytechnica Hungarica,2022

4. Threatening language detection and target identification in Urdu tweets;Amjad;IEEE Access,2021

5. Cyberbullying detection: advanced preprocessing techniques & deep learning architecture for Roman Urdu data;Dewani;Journal of Big Data,2021a

Cited by 1 article.

1. Hate Speech and Target Community Detection in Nastaliq Urdu Using Transfer Learning Techniques;IEEE Access;2024