Abstract
Researchers and industry developers in artificial intelligence (AI) and natural language processing (NLP) have uniformly adopted a Rawlsian definition of fairness. On this definition, a technology is fair if performance is maximized for the least advantaged. We argue this definition has considerable loopholes, which can be used to legitimize common practices in AI/NLP research that actively contribute to social and economic inequalities. Such practices include what we shall refer to as Subgroup Test Ballooning and Snapshot-Representative Evaluation. Subgroup Test Ballooning refers to the practice of initially tailoring a technology to a specific target group of technology-ready early adopters to collect feedback faster. Snapshot-Representative Evaluation refers to the practice of evaluating a technology on a representative sample of current end users. Both strategies may contribute to social and economic inequalities but are commonly justified using arguments familiar from political economics and grounded in Rawlsian fairness. We discuss an egalitarian alternative to Rawlsian fairness, as well as, more generally, the roadblocks on the path toward globally and socially fair AI/NLP research and development.
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences
Cited by
2 articles.