Abstract
Inclusive language, among many other gender equality initiatives in society, has received considerable attention in recent years. Gender equality offices in universities and public administrations cannot keep up with manually checking the documents their institutions produce for non-inclusive language. This research introduces an automated solution, based on machine learning techniques, for detecting non-inclusive uses of Spanish in doctoral theses written at Spanish universities. A large dataset was used for training and validation and to analyze the use of inclusive language. The result is an algorithm that detects non-inclusive usages in any Spanish text document with error, false positive, and false negative ratios slightly over 10%, and with precision, recall, and F-measure over 86%. The results also show how the ratio of non-inclusive usages per document has evolved over time, with a pronounced reduction in the last years under study.
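As a rough illustration of how the figures quoted in the abstract relate to one another, the sketch below derives them from a confusion matrix. The definitions used here are the standard ones and are an assumption, since the abstract does not state how each ratio was computed. The class `Confusion`, the function `metrics`, and the counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Hypothetical counts from comparing a detector with hand-labelled usages."""
    tp: int  # non-inclusive usages correctly flagged
    fp: int  # inclusive usages wrongly flagged
    fn: int  # non-inclusive usages that were missed
    tn: int  # inclusive usages correctly left unflagged


def metrics(c: Confusion) -> dict:
    """Compute standard evaluation ratios (assumed definitions, not the paper's)."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "error": (c.fp + c.fn) / total,
        "false_positive_rate": c.fp / (c.fp + c.tn),
        "false_negative_rate": c.fn / (c.fn + c.tp),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Made-up counts for illustration only; these are not the paper's data.
example = Confusion(tp=90, fp=10, fn=10, tn=90)
result = metrics(example)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the false negative rate equals one minus recall, so the two figures always move together.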
Subject
Computer Science Applications, Media Technology, Communication, Business and International Management, Library and Information Sciences
Cited by
3 articles.