Abstract
Offensive content is increasingly prevalent on online communication and social media platforms, which makes its detection difficult, especially in multilingual settings. The term “Offensive Language” encompasses a wide range of expressions, including various forms of hate speech and aggressive content. Exploring offensive content across multiple languages, rather than focusing on a single one, therefore captures greater linguistic diversity and more cultural factors. By exploring multilingual offensive content, we can broaden our understanding and more effectively combat the global impact of offensive language. This survey examines the current state of multilingual offensive language detection, providing a comprehensive analysis of previous multilingual approaches and existing datasets, along with resources in the field. We also explore the challenges the community faces on this task, including technical, cultural, and linguistic ones, as well as their limitations. Finally, we propose several potential future directions toward more efficient solutions for multilingual offensive language detection, enabling a safer digital communication environment worldwide.