Affiliation:
1. Key Laboratory for Artificial Intelligence and Cognitive Neuroscience of Language, Xi’an International Studies University, Xi’an 610116, China
2. Faculty of Computing and Informatics, Universiti Malaysia Sabah, Sabah 88400, Malaysia
Abstract
Speech reflects a person’s mental state, and the microphone is a promising sensor for human–computer interaction; speech recognition with this sensor can also support the diagnosis of mental illnesses. However, gender differences between speakers affect speech emotion recognition based on specific acoustic features and reduce recognition accuracy. We therefore argue that accuracy can be effectively improved by selecting different speech features for emotion recognition according to the speech characteristics of each gender. In this paper, we propose a speech emotion recognition method based on gender classification. First, a multilayer perceptron (MLP) classifies the original speech by speaker gender. Second, based on the different acoustic characteristics of male and female speech, we analyze the influence weights of multiple emotion-related speech features for each gender and build separate optimal feature sets for male and female emotion recognition. Finally, we train and test a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network on the male and female emotion feature sets, respectively. The results show that the proposed gender-specific recognition models achieve higher average recognition accuracy than gender-mixed recognition models.
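For a concrete picture of the pipeline, the following is a minimal PyTorch sketch of the gender-branched recognition flow described in the abstract. The feature dimension, number of emotion classes, the per-gender feature index sets, and the use of a single BiLSTM branch per gender (rather than the paper's separate CNN and BiLSTM models) are illustrative assumptions, not details taken from the paper.

# Minimal sketch of the gender-branched SER pipeline described in the abstract.
# Feature sizes, emotion classes, and the per-gender feature subsets below are
# illustrative assumptions only.
import torch
import torch.nn as nn

N_FEATS = 64      # assumed size of the full acoustic feature vector (e.g., MFCC + pitch + energy)
N_EMOTIONS = 6    # assumed number of emotion classes

class GenderMLP(nn.Module):
    """Step 1: classify speaker gender from utterance-level acoustic features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATS, 32), nn.ReLU(),
            nn.Linear(32, 2))                 # two outputs: male / female
    def forward(self, x):
        return self.net(x)

class EmotionBiLSTM(nn.Module):
    """Step 3: gender-specific emotion recognizer over a frame-level feature sequence."""
    def __init__(self, n_selected):
        super().__init__()
        self.lstm = nn.LSTM(n_selected, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, N_EMOTIONS)
    def forward(self, x):                     # x: (batch, frames, n_selected)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])            # last time step -> emotion logits

# Step 2: gender-specific feature subsets. Chosen arbitrarily here; the paper
# derives them from per-gender feature-importance (influence weight) analysis.
male_idx = torch.arange(0, 40)
female_idx = torch.arange(24, 64)

gender_clf = GenderMLP()
male_model = EmotionBiLSTM(len(male_idx))
female_model = EmotionBiLSTM(len(female_idx))

def recognize(frames):                        # frames: (batch, n_frames, N_FEATS)
    gender = gender_clf(frames.mean(dim=1)).argmax(dim=1)   # utterance-level gender decision
    logits = torch.empty(frames.size(0), N_EMOTIONS)
    for g, idx, model in [(0, male_idx, male_model), (1, female_idx, female_model)]:
        mask = gender == g
        if mask.any():
            # each gender branch sees only its selected feature subset
            logits[mask] = model(frames[mask][:, :, idx])
    return logits

# Toy usage: 8 utterances, 100 frames each, random features
with torch.no_grad():
    print(recognize(torch.randn(8, 100, N_FEATS)).shape)    # torch.Size([8, 6])

The sketch only illustrates the routing idea: an utterance is first assigned a gender, and the emotion model for that gender then operates on its own optimized feature subset.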
Funder
Social Science Foundation of Shaanxi Province of China
National Social Science Foundation of China
Natural Science Basic Research Program of Shaanxi Province of China
Shaanxi Educational Science and Planning Foundation for “14th Five-Year Plan” of China
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
7 articles.