Affiliation:
1. Department of Computer Science, University of Maryland. ycao95@umd.edu
2. Microsoft Research, University of Maryland. me@hal3.name
Abstract
Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systematic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a coreference resolution system. We inspect many existing data sets for trans-exclusionary biases, and develop two new data sets for interrogating bias in both crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, stereotyping, and over- or under-representation, especially for binary and non-binary trans users.
Subject
Artificial Intelligence, Computer Science Applications, Linguistics and Language, Language and Linguistics
Cited by
13 articles.
1. AI‐Based Surveillance Systems for Effective Attendance Management;Mathematical Models Using Artificial Intelligence for Surveillance Systems;2024-08-09
2. Transforming Dutch: Debiasing Dutch Coreference Resolution Systems for Non-binary Pronouns;The 2024 ACM Conference on Fairness, Accountability, and Transparency;2024-06-03
3. GazePointAR: A Context-Aware Multimodal Voice Assistant for Pronoun Disambiguation in Wearable Augmented Reality;Proceedings of the CHI Conference on Human Factors in Computing Systems;2024-05-11
4. Using ChatGPT to Generate Gendered Language;2023 31st Irish Conference on Artificial Intelligence and Cognitive Science (AICS);2023-12-07
5. A domain-specific language for describing machine learning datasets;Journal of Computer Languages;2023-08