An analysis of unconscious gender bias in academic texts by means of a decision algorithm

Published: 2021-09-30 Issue: 9 Volume: 16 Page: e0257903
ISSN: 1932-6203
Container-title: PLOS ONE
Language: en
Short-container-title: PLoS ONE

Author:

Orgeira-Crespo Pedro, Míguez-Álvarez Carla, Cuevas-Alonso Miguel, Rivo-López Elena

Abstract

Inclusive language focuses on choosing vocabulary that avoids exclusion or discrimination, especially with regard to gender. Finding gender bias in written documents must be done manually, which is a time-consuming process. Consequently, studying the usage of non-inclusive language in a document, and the impact of different document properties (such as author gender, date of presentation, etc.) on how many non-inclusive instances are found, is quite difficult or even impossible for big datasets. This research analyzes gender bias in academic texts using a study corpus of more than 12,000 million words obtained from more than one hundred thousand doctoral theses from Spanish universities. For this purpose, an automated algorithm was developed to evaluate the different characteristics of each document and look for interactions between age, year of publication, gender, and the field of knowledge in which the doctoral thesis is framed. The algorithm identified information patterns using a convolutional neural network (CNN) by creating a vector representation of the sentences. The results showed evidence that bias increased with the age of the authors and that men were more likely to use non-inclusive terms (up to an index of 23.12); awareness of inclusiveness was greater in women than in men in all age ranges (with an average of 14.99), and this awareness was greater the younger the candidate (falling to 13.07). In terms of field of knowledge, the humanities are the most biased (20.97), with the exception of the subgroup of Linguistics, which has the least bias at all levels (9.90); the field of science and engineering is also among the least biased (13.46). These results support the assumption that the bias in academic texts (doctoral theses) is due to unconscious issues: otherwise, it would not depend on field, age, or gender, and would occur in any field in the same proportion. The innovation provided by this research lies mainly in the ability to detect, within a textual document in Spanish, whether the use of language can be considered non-inclusive, based on a CNN that has been trained in the context of the doctoral thesis. A significant number of documents were used, covering all accessible doctoral theses from Spanish universities from the last 40 years; this dataset is only manageable by data mining systems, and the training allows the terms to be identified effectively in context and compiled in a novel dictionary of non-inclusive terms.
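
The abstract does not describe the network's architecture, so the following is only a toy sketch of the general idea of a CNN over sentence vectors: each word is mapped to a vector, a filter slides over consecutive word vectors, the strongest response is kept (max pooling), and a logistic output turns it into a score. Everything here is an assumption made for illustration: the function names, the two-dimensional embeddings, and the hand-set weights are hypothetical and are not taken from the paper, whose model was trained on doctoral theses.

```python
import math

def conv1d_max_pool(sentence_vectors, filters):
    """Slide each filter over consecutive word vectors, apply ReLU, keep the max."""
    pooled = []
    for filt in filters:                  # filt is a list of `width` weight vectors
        width = len(filt)
        best = 0.0                        # ReLU floor: activations below 0 are clipped
        for start in range(len(sentence_vectors) - width + 1):
            window = sentence_vectors[start:start + width]
            activation = sum(
                w * x
                for weight_vec, word_vec in zip(filt, window)
                for w, x in zip(weight_vec, word_vec)
            )
            best = max(best, activation)
        pooled.append(best)
    return pooled

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score_sentence(tokens, embeddings, filters, out_weights, out_bias):
    """Return a score in (0, 1); higher means 'more likely non-inclusive' in this toy."""
    dim = len(next(iter(embeddings.values())))
    # Unknown words fall back to a zero vector.
    vectors = [embeddings.get(tok, [0.0] * dim) for tok in tokens]
    features = conv1d_max_pool(vectors, filters)
    logit = sum(w * f for w, f in zip(out_weights, features)) + out_bias
    return sigmoid(logit)

# Hypothetical hand-picked embeddings and weights (NOT trained, NOT from the paper).
EMBEDDINGS = {
    "los": [1.0, 0.0],
    "alumnos": [0.0, 1.0],
    "el": [0.5, 0.0],
    "alumnado": [0.0, -1.0],
}
FILTERS = [[[1.0, 0.0], [0.0, 1.0]]]      # one filter spanning two consecutive words
OUT_WEIGHTS = [2.0]
OUT_BIAS = -2.0

generic_masculine = score_sentence(["los", "alumnos"], EMBEDDINGS, FILTERS, OUT_WEIGHTS, OUT_BIAS)
collective_noun = score_sentence(["el", "alumnado"], EMBEDDINGS, FILTERS, OUT_WEIGHTS, OUT_BIAS)
print(generic_masculine > collective_noun)
```

With these toy weights the filter responds to the generic masculine "los alumnos" and not to the collective "el alumnado". In a real system the embeddings, filters, and output weights would all be learned from labeled sentences rather than set by hand.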

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

Cited by 6 articles.

1. New insights into the rural development economies under the moderating role of gender equality and mediating role of rural women development;Journal of Rural Studies;2023-12

2. Characterizing Bias in Word Embeddings Towards Analyzing Gender Associations in Philippine Texts;2023 IEEE World Conference on Applied Intelligence and Computing (AIC);2023-07-29

3. Distribution of Female and Male First and Last Authorship across Drug Delivery Related Journals with Respect to Year and Journal Impact Factor;Molecular Pharmaceutics;2023-06-23

4. Reframing data ethics in research methods education: a pathway to critical data literacy;International Journal of Educational Technology in Higher Education;2023-02-20

5. Are Companies Committed to Preventing Gender Violence against Women? The Role of the Manager’s Implicit Resistance;Social Sciences;2022-12-26