Abstract
The interest in author profiling tasks has increased in the research community because computer applications have shown success in different sectors such as security, marketing, healthcare, and others. Recognition and identification of traits such as gender, age or location based on text data can help to improve different marketing strategies. This type of technology has been widely discussed regarding documents taken from social media. However, its methods have been poorly studied using data with a more formal structure, where there is no access to emoticons, mentions, and other linguistic phenomena that are only present in social media. This paper proposes the use of recurrent and convolutional neural networks and a transfer learning strategy to recognize two demographic traits, i.e., gender and language variety, in documents written in informal and formal language. The models were tested in two different databases consisting of tweets (informal) and call-center conversations (formal). Accuracies of up to 75 % and 68 % were achieved in the recognition of gender in documents with informal and formal language, respectively. Moreover, regarding language variety recognition, accuracies of 92 % and 72 % were obtained in informal and formal text scenarios, respectively. The results indicate that, in relation to the traits considered in this paper, it is possible to transfer the knowledge from a system trained on a specific type of expressions to another one where the structure is completely different and data are scarcer.
Publisher
Instituto Tecnologico Metropolitano (ITM)
Reference36 articles.
1. F. Chiu Hsieh; R. F. Sandroni Dias; I. Paraboni, “Author profiling from Facebook corpora,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pp. 2566- 2570, 2018. https://aclanthology.org/L18-1407.pdf
2. O. Dogan; B. Oztaysi, “Gender prediction from classified indoor customer paths by fuzzy C-medoids clustering,” in Intelligent and Fuzzy Techniques in Big Data Analytics and Decision Making INFUS 2019. Advances in Intelligent Systems and Computing, vol 1029. Springer, Cham., pp. 160–169. https://doi.org/10.1007/978-3-030-23756-1_21
3. R. Hirt; N. Kühl; G. Satzger, “Cognitive computing for customer profiling: meta classification for gender prediction,” Electron. Mark., vol. 39, no. 1, pp. 93–106, Feb. 2019. https://doi.org/10.1007/s12525-019-00336-z
4. D. Fernandez-Lanvin; J. de Andres-Suarez; M. Gonzalez-Rodriguez; B. Pariente-Martinez, “The dimension of age and gender as user model demographic factors for automatic personalization in e-commerce sites,” Comput. Stand. Interfaces, vol. 59, pp. 1–9, Aug. 2018. https://doi.org/10.1016/j.csi.2018.02.001
5. M. Arroju; A. Hassan; G. Farnadi, “Age, gender and personality recognition using tweets in a multilingual setting Notebook for PAN at CLEF 2015”. in 6th Conference and Labs of the Evaluation Forum (CLEF), 2015, pp. 23-31. https://biblio.ugent.be/publication/7100086
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献