Developing a sentence level fairness metric using word embeddings

Published: 2022-10-10
ISSN: 2524-7832
Container-title: International Journal of Digital Humanities
Language: en
Short-container-title: Int J Digit Humanities

Author:

Izzidien Ahmed, Fitz Stephen, Romero Peter, Loe Bao S., Stillwell David

Abstract

Fairness is a principal social value that is observable in civilisations around the world. Yet a fairness metric for digital texts that describe even a simple social interaction, e.g., ‘The boy hurt the girl’, has not been developed. We address this by employing word embeddings that use factors found in a new social psychology literature review on the topic. We use these factors to build fairness vectors. These vectors are used as sentence level measures, whereby each dimension reflects a fairness component. The approach is employed to approximate human perceptions of fairness. The method leverages a pro-social bias within word embeddings, for which we obtain an F1 = 79.8 on a list of sentences using the Universal Sentence Encoder (USE). A second approach, using principal component analysis (PCA) and machine learning (ML), produces an F1 = 86.2. Repeating these tests using Sentence Bidirectional Encoder Representations from Transformers (SBERT) produces an F1 = 96.9 and F1 = 100, respectively. Improvements using subspace representations are further suggested. By proposing a first-principles approach, the paper contributes to the analysis of digital texts along an ethical dimension.
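
The idea of scoring a sentence against a fairness vector can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the fairness factors or the exact scoring procedure, so the toy 3-dimensional vectors, the word lists, and the helper names (`embed_sentence`, `fairness_axis`, `fairness_score`) are all assumptions made for illustration. Averaged toy word vectors stand in for a real sentence encoder such as USE or SBERT, and a single axis stands in for one fairness component.

```python
import numpy as np

# Hypothetical toy embeddings; a real setup would use USE or SBERT vectors.
TOY_EMBEDDINGS = {
    "helped": np.array([0.9, 0.1, 0.2]),
    "shared": np.array([0.8, 0.2, 0.1]),
    "hurt": np.array([-0.8, 0.1, 0.3]),
    "cheated": np.array([-0.9, 0.2, 0.1]),
    "the": np.array([0.0, 0.5, 0.5]),
    "boy": np.array([0.1, 0.7, 0.2]),
    "girl": np.array([0.1, 0.6, 0.3]),
}


def embed_sentence(sentence):
    """Average the vectors of known words (a stand-in for a sentence encoder)."""
    words = [w for w in sentence.lower().split() if w in TOY_EMBEDDINGS]
    if not words:
        raise ValueError("no known words in sentence")
    return np.mean([TOY_EMBEDDINGS[w] for w in words], axis=0)


def fairness_axis(fair_words, unfair_words):
    """Unit vector pointing from the 'unfair' centroid to the 'fair' centroid."""
    fair = np.mean([TOY_EMBEDDINGS[w] for w in fair_words], axis=0)
    unfair = np.mean([TOY_EMBEDDINGS[w] for w in unfair_words], axis=0)
    axis = fair - unfair
    return axis / np.linalg.norm(axis)


def fairness_score(sentence, axis):
    """Cosine similarity between the sentence vector and the unit-length axis."""
    vec = embed_sentence(sentence)
    return float(np.dot(vec, axis) / np.linalg.norm(vec))


# Assumed seed words for a single illustrative fairness component.
AXIS = fairness_axis(["helped", "shared"], ["hurt", "cheated"])

for text in ("The boy hurt the girl", "The boy helped the girl"):
    print(text, "->", round(fairness_score(text, AXIS), 3))
```

With these toy vectors, ‘The boy hurt the girl’ receives a negative score and ‘The boy helped the girl’ a positive one. Stacking several such axes, one per fairness factor, would give a vector in which each dimension reflects one component, in the spirit of the description above.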

Funder

NGI Trust

The Psychometrics Centre, Cambridge Judge Business School Small Grants Scheme

The Isaac Newton Trust

Publisher

Springer Science and Business Media LLC

Subject

General Medicine

Link

https://link.springer.com/content/pdf/10.1007/s42803-022-00049-4.pdf

References (101 articles)

1. Aiello, L. M., Quercia, D., Zhou, K., Constantinides, M., Šćepanović, S., & Joglekar, S. (2021). How epidemic psychology works on Twitter: Evolution of responses to the COVID-19 pandemic in the US. Humanities and Social Sciences Communications, 8(1), 1–15.

2. Araque, O., Gatti, L., & Kalimeri, K. (2020). MoralStrength: Exploiting a moral lexicon and embedding similarity for moral foundations prediction. Knowledge-Based Systems, 191, 105184.

3. Bahdanau, D., Cho, K., & Bengio, Y. (2016). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 [cs, stat]. Retrieved December 21, 2021, from http://arxiv.org/abs/1409.0473

4. Barbieri, F., Camacho-Collados, J., Espinosa Anke, L., & Neves, L. (2020). TweetEval: Unified benchmark and comparative evaluation for tweet classification. Findings of the Association for Computational Linguistics: EMNLP 2020, 1644–1650. https://doi.org/10.18653/v1/2020.findings-emnlp.148

5. Bartling, B., & Fischbacher, U. (2012). Shifting the blame: On delegation and responsibility. The Review of Economic Studies, 79(1), 67–87. https://doi.org/10.1093/restud/rdr023