Tackling racial bias in automated online hate detection: Towards fair and accurate detection of hateful users with geometric deep learning

Published: 2022-02-14 Issue: 1 Volume: 11
ISSN: 2193-1127
Container-title: EPJ Data Science
Language: en
Short-container-title: EPJ Data Sci.

Authors:

Ahmed Zo, Vidgen Bertie, Hale Scott A.

Abstract

Online hate is a growing concern on many social media platforms, making them unwelcoming and unsafe. To combat this, technology companies are increasingly developing techniques to automatically identify and sanction hateful users. However, accurate detection of such users remains a challenge because speech is contextual: its meaning depends on the social setting in which it is used. This contextual nature of speech has also led to minoritized users, especially African–Americans, being unfairly detected as ‘hateful’ by the very algorithms designed to protect them. To resolve this problem of inaccurate and unfair hate detection, research has focused on developing machine learning (ML) systems that better understand textual context. Incorporating the social networks of hateful users has received less attention, despite social science research suggesting that they provide rich contextual information. We present a system for more accurately and fairly detecting hateful users by incorporating social network information through geometric deep learning, an ML technique that dynamically learns information-rich network representations. We make two main contributions. First, we demonstrate that adding network information with geometric deep learning produces a more accurate classifier than techniques that either exclude network information entirely or incorporate it through manual feature engineering. Our best-performing model achieves an AUC score of 90.8% on a previously released hateful user dataset. Second, we show that such information also leads to fairer outcomes: using the ‘predictive equality’ fairness criterion, we compare the false positive rates of our geometric learning algorithm with those of other ML techniques and find that our best-performing classifier has no false positives among a subset of African–American users. A neural network without network information has the largest number of false positives at 26, while a neural network incorporating manual network features has 13 false positives among African–American users. The system we present highlights the importance of effectively incorporating social network features in automated hateful user detection, raising new opportunities to improve how online hate is tackled.
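The ‘predictive equality’ criterion named in the abstract compares false positive rates across groups: among users who are truly not hateful, what share is wrongly flagged in each group. The following is a minimal sketch of that comparison, not the authors' code. The function names, the group labels "a" and "b", and the toy labels are all hypothetical and chosen only for illustration; the abstract itself reports false positive counts rather than the rates computed here.

```python
from collections import defaultdict


def false_positive_rates(y_true, y_pred, groups):
    """Return {group: FPR}, with FPR = FP / (FP + TN) over truly non-hateful users.

    Labels are assumed to be 1 for 'hateful' and 0 for 'not hateful'.
    Groups with no non-hateful users are omitted, since their FPR is undefined.
    """
    fp = defaultdict(int)  # non-hateful users wrongly flagged as hateful
    tn = defaultdict(int)  # non-hateful users correctly left unflagged
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            if pred == 1:
                fp[group] += 1
            else:
                tn[group] += 1
    rates = {}
    for group in set(fp) | set(tn):
        rates[group] = fp[group] / (fp[group] + tn[group])
    return rates


def predictive_equality_gap(rates):
    """Largest difference in FPR between any two groups (0.0 means equal rates)."""
    return max(rates.values()) - min(rates.values())


# Toy data, invented for illustration only (hypothetical groups "a" and "b").
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
groups = ["a"] * 5 + ["b"] * 5

rates = false_positive_rates(y_true, y_pred, groups)
gap = predictive_equality_gap(rates)
print(rates, gap)
```

In this toy data, group "a" has one false positive among four non-hateful users and group "b" has none, so the gap is 0.25. A classifier satisfies predictive equality when this gap is zero or close to it.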

Funder

Engineering and Physical Sciences Research Council

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics, Computer Science Applications, Modeling and Simulation

Link

https://link.springer.com/content/pdf/10.1140/epjds/s13688-022-00319-9.pdf

References (68 articles)

1. Alorainy W, Burnap P, Liu H, Williams ML (2019) “The enemy among us”: detecting cyber hate speech with threats-based othering language embeddings. ACM Trans Web 13(3). https://doi.org/10.1145/3324997

2. Barocas S, Hardt M (2017) NIPS 2017 Tutorial: fairness in machine learning. https://arxiv.org/abs/2005.03909

3. Blodgett SL, Green L, O’Connor B (2016) Demographic dialectal variation in social media: a case study of African–American English. In: EMNLP 2016—conference on empirical methods in natural language processing, proceedings. https://doi.org/10.18653/v1/d16-1120

4. Bordes A, Usunier N, Garcia-Durán A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation

5. Bowman-Grieve L (2009) Exploring stormfront: a virtual community of the radical right. Stud Confl Terrorism 32(11):989–1007. https://doi.org/10.1080/10576100903259951

Cited by 7 articles.

1. Transfer learning approach for identifying negative sentiment in tweets directed to football players;Engineering Applications of Artificial Intelligence;2024-07

2. “Keep Your Heads Held High Boys!”: Examining the Relationship between the Proud Boys’ Online Discourse and Offline Activities;American Political Science Review;2024-02-13

3. Current Topological and Machine Learning Applications for Bias Detection in Text;2023 6th International Conference on Signal Processing and Information Security (ICSPIS);2023-11-08

4. Enhancing social network hate detection using back translation and GPT-3 augmentations during training and test-time;Information Fusion;2023-11

5. BERT-Based Logits Ensemble Model for Gender Bias and Hate Speech Detection;J Inf Process Syst;2023