Abstract
Sexism, a pervasive form of oppression, causes profound suffering through its many manifestations. Given the growing number of experiences of sexism shared online, automatically categorizing these accounts can support the fight against sexism by enabling effective analyses by gender studies researchers and government officials involved in policy making. In this paper, we examine the fine-grained, multi-label classification of accounts (reports) of sexism. To the best of our knowledge, our 23-class problem formulation considers substantially more categories of sexism than any related prior work. Moreover, we present the first semi-supervised work for the multi-label classification of accounts describing any type(s) of sexism. We devise self-training-based techniques tailored to the multi-label nature of the problem in order to utilize unlabeled samples for augmenting the labeled set. We identify high textual diversity with respect to the existing labeled set as a desirable quality in candidate unlabeled instances and develop methods for incorporating it into our approach. We also explore ways of infusing class-imbalance alleviation for multi-label classification into our semi-supervised learning, both independently and in conjunction with the diversity-based method. In addition to these data augmentation methods, we develop a neural model that combines biLSTM and attention with a domain-adapted BERT model in an end-to-end trainable manner. Further, we formulate a multi-level training approach in which models are trained sequentially on categories of sexism at different levels of granularity. Moreover, we devise a loss function that exploits any label confidence scores associated with the data. Several of the proposed methods outperform various baselines on a recently released dataset for multi-label sexism categorization across several standard metrics.
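To make the semi-supervised idea concrete, below is a minimal sketch of a self-training loop for multi-label text classification that filters pseudo-labeled candidates by both prediction confidence and textual diversity with respect to the labeled set. It is not the paper's exact algorithm: the TF-IDF/cosine-similarity diversity measure, the one-vs-rest logistic-regression stand-in for the neural classifier, the function name `self_train_multilabel`, and all thresholds are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.multiclass import OneVsRestClassifier

def self_train_multilabel(X_lab, Y_lab, X_unlab, n_rounds=3,
                          conf_threshold=0.9, diversity_threshold=0.3):
    """Round-based self-training: pseudo-label unlabeled texts, keep only
    candidates that are both confident and textually diverse, then retrain.
    Y_lab is a binary indicator matrix of shape (n_labeled, n_labels)."""
    X_lab, X_unlab = list(X_lab), list(X_unlab)
    Y_lab = np.asarray(Y_lab)
    vec = TfidfVectorizer(max_features=20000)
    for _ in range(n_rounds):
        if not X_unlab:
            break
        V_lab = vec.fit_transform(X_lab)
        clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
        clf.fit(V_lab, Y_lab)
        V_unlab = vec.transform(X_unlab)
        probs = clf.predict_proba(V_unlab)           # (n_unlab, n_labels)
        # simplification: treat the most confident label as the instance's
        # overall confidence score
        confident = probs.max(axis=1) >= conf_threshold
        # diversity proxy: keep candidates whose nearest labeled neighbor
        # (by cosine similarity) is sufficiently far away
        max_sim = cosine_similarity(V_unlab, V_lab).max(axis=1)
        diverse = max_sim <= (1.0 - diversity_threshold)
        keep = np.where(confident & diverse)[0]
        if len(keep) == 0:
            break
        pseudo_Y = (probs[keep] >= 0.5).astype(int)  # per-label pseudo-labels
        X_lab += [X_unlab[i] for i in keep]
        Y_lab = np.vstack([Y_lab, pseudo_Y])
        keep_set = set(keep.tolist())
        X_unlab = [x for i, x in enumerate(X_unlab) if i not in keep_set]
    # final fit on the augmented labeled set
    V_lab = vec.fit_transform(X_lab)
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    clf.fit(V_lab, Y_lab)
    return clf, vec
```

Requiring candidates to be dissimilar to the existing labeled set is what distinguishes this from vanilla self-training, which tends to add redundant near-duplicates of what the model already handles well.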
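The abstract also describes a neural architecture combining biLSTM and attention with a domain-adapted BERT model, trained end to end. The following PyTorch sketch shows one plausible instantiation under those constraints; the class name, hidden size, and additive-attention pooling are assumptions, and the actual domain-adapted BERT checkpoint is stood in for by `bert-base-uncased`.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BertBiLSTMAttention(nn.Module):
    """Illustrative sketch: contextual embeddings from a (domain-adapted)
    BERT feed a biLSTM; additive attention pools the hidden states into a
    single vector scored against all 23 labels (sigmoid, multi-label)."""
    def __init__(self, bert_name="bert-base-uncased", hidden=256, n_labels=23):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)        # additive attention scorer
        self.out = nn.Linear(2 * hidden, n_labels)

    def forward(self, input_ids, attention_mask):
        emb = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        h, _ = self.lstm(emb)                        # (B, T, 2*hidden)
        scores = self.attn(h).squeeze(-1)            # (B, T)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)        # attention weights
        ctx = torch.einsum("bt,bth->bh", alpha, h)   # weighted sum over time
        return self.out(ctx)                         # raw logits per label
```

Because every component is differentiable, gradients flow from the per-label logits back into BERT itself, which is what "end-to-end trainable" implies here; training would pair these logits with a multi-label loss such as `nn.BCEWithLogitsLoss`.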
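Finally, the abstract mentions a loss function that exploits label confidence scores without specifying its form. One plausible reading, shown purely as an assumption, is a binary cross-entropy in which each label's term is weighted by how confidently that label was annotated or pseudo-labeled:

```python
import torch

def confidence_weighted_bce(logits, targets, confidences):
    """Hypothetical confidence-weighted multi-label loss: `targets` is a
    float 0/1 matrix and `confidences` holds per-label scores in [0, 1],
    both the same shape as `logits`."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")           # per-label loss terms
    return (confidences * bce).mean()                # down-weight uncertain labels
```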
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications, Computational Mechanics
Cited by
9 articles.