Twitter-based gender recognition using transformers

Published:2023 Issue:9 Volume:20 Page:15962-15981
ISSN:1551-0018
Container-title:Mathematical Biosciences and Engineering
Short-container-title:MBE

Author:

Nia Zahra Movahedi¹,², Ahmadi Ali³,⁴, Mellado Bruce¹,⁵, Wu Jianhong¹,², Orbinski James¹,⁶, Asgary Ali¹,⁴, Kong Jude D.¹,²

Affiliation:

1. Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), York University, Canada

2. Laboratory for Industrial and Applied Mathematics, York University, Canada

3. K.N Toosi University, Faculty of Computer Engineering, Tehran, Iran

4. Advanced Disaster, Emergency and Rapid-Response Simulation (ADERSIM), York University, Toronto, Ontario, Canada

5. School of Physics, Institute for Collider Particle Physics, University of Witwatersrand, Johannesburg, South Africa

6. Dahdaleh Institute for Global Health Research, York University, Canada

Abstract

Social media contains useful information about people and society that could help advance research in many areas of health, such as mental health, health surveillance, socio-economic inequality and gender vulnerability (e.g., by applying opinion mining, emotion/sentiment analysis and statistical analysis). User demographics provide rich information that could help study these areas further. However, user demographics such as gender are considered private and are not freely available. In this study, we propose a model based on transformers to predict the user's gender from their images and tweets. The image-based classification model is trained in two different ways: using the profile image of the user and using various image contents posted by the user on Twitter. For the first method, a Twitter gender recognition dataset publicly available on Kaggle is used; for the second, the PAN-18 dataset is used. Several transformer models, i.e., Vision Transformer (ViT), LeViT and Swin Transformer, are fine-tuned on both image datasets and then compared. Next, different transformer models, namely Bidirectional Encoder Representations from Transformers (BERT), RoBERTa and ELECTRA, are fine-tuned to recognize the user's gender from their tweets. This is highly beneficial because not all users provide an image that indicates their gender; the gender of such users could be detected from their tweets. The significance of the image and text classification models was evaluated using the Mann-Whitney U test. Finally, the combination model improved the accuracy of the image and text classification models by 11.73% and 5.26% for the Kaggle dataset and by 8.55% and 9.8% for the PAN-18 dataset, respectively. This shows that the image and text classification models are capable of complementing each other by providing additional information to one another. Our overall multimodal method has an accuracy of 88.11% for the Kaggle dataset and 89.24% for the PAN-18 dataset and outperforms state-of-the-art models. Our work benefits research that critically requires user demographic information such as gender to further analyze and study social media content for health-related issues.

Publisher

American Institute of Mathematical Sciences (AIMS)

Subject

Applied Mathematics, Computational Mathematics, General Agricultural and Biological Sciences, Modeling and Simulation, General Medicine

References: 60 articles.

1. J. Gao, P. Zheng, Y. Jia, H. Chen, Y. Mao, S. Chen, et al., Mental health problems and social media exposure during COVID-19 outbreak, PLOS ONE, 15 (2020). https://doi.org/10.1371/journal.pone.0231924

2. M. J. Aramburu, R. Berlanga, I. Lanza, Social media multidimensional analysis for intelligent health surveillance, Int. J. Env. Res. Public Health, 17 (2020), 2289. https://doi.org/10.3390/ijerph17072289

3. J. B. Whiting, J. C. Pickens, A. L. Sagers, M. PettyJohn, B. Davies, Trauma, social media, and #WhyIDidntReport: An analysis of Twitter posts about reluctance to report sexual assault, J. Marital Fam. Ther., 47 (2021), 749–766. https://doi.org/10.1111/jmft.12470

4. T. Simon, A. Goldberg, L. Aharonson-Daniel, D. Leykin, B. Adini, Twitter in the cross fire–the use of social media in the Westgate Mall terror attack in Kenya, PLOS ONE, 9 (2014). https://doi.org/10.1371/journal.pone.0104136

5. G. Coppersmith, R. Leary, A. Fine, Natural language processing of social media as screening for suicide risk, Biomed. Inform. Insights, 10 (2018). https://doi.org/10.1177/1178222618792860