Gender-Driven English Speech Emotion Recognition with Genetic Algorithm

Published:2024-06-14 Issue:6 Volume:9 Page:360
ISSN:2313-7673
Container-title:Biomimetics
language:en
Short-container-title:Biomimetics

Author:

Yue Liya¹, Hu Pei², Zhu Jiulong¹

Affiliation:

1. Fanli Business School, Nanyang Institute of Technology, Nanyang 473004, China

2. School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, China

Abstract

Speech emotion recognition based on gender holds great importance for achieving more accurate, personalized, and empathetic interactions in technology, healthcare, psychology, and social sciences. In this paper, we present a novel gender–emotion model. First, gender and emotion features were extracted from voice signals to lay the foundation for our recognition model. Second, a genetic algorithm (GA) processed high-dimensional features, and the Fisher score was used for evaluation. Third, features were ranked by their importance, and the GA was improved through novel crossover and mutation methods based on feature importance, to improve the recognition accuracy. Finally, the proposed algorithm was compared with state-of-the-art algorithms on four common English datasets using support vector machines (SVM), and it demonstrated superior performance in accuracy, precision, recall, F1-score, the number of selected features, and running time. The proposed algorithm faced challenges in distinguishing between neutral, sad, and fearful emotions, due to subtle vocal differences, overlapping pitch and tone variability, and similar prosodic features. Notably, the primary features for gender-based differentiation mainly involved mel frequency cepstral coefficients (MFCC) and log MFCC.
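The abstract says the Fisher score was used to evaluate features and that features were then ranked by importance. As a rough illustration only, the sketch below computes the commonly used form of the Fisher score (between-class scatter divided by within-class scatter, per feature) and ranks features by it. The function names `fisher_scores` and `rank_features`, the toy data, and the exact formula are assumptions for illustration; the paper's own definition and its GA crossover and mutation operators are not given in this record and may differ.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per feature column.

    X is a list of rows (each a list of floats); y holds the class labels.
    Score_j = sum_k n_k * (mean_kj - mean_j)^2 / sum_k n_k * var_kj
    (a common textbook form; assumed here, not taken from the paper).
    """
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # class means spread around the overall mean
        within = 0.0   # spread of samples inside each class
        for rows in groups.values():
            values = [r[j] for r in rows]
            n_k = len(values)
            mean_k = sum(values) / n_k
            var_k = sum((v - mean_k) ** 2 for v in values) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        # A feature that is constant within every class gets 0.0 here
        # to avoid division by zero (a simplification of this sketch).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def rank_features(scores):
    """Return feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: two features, two classes.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
y = ["female", "female", "male", "male"]

scores = fisher_scores(X, y)
ranking = rank_features(scores)
print(ranking)
```

In this toy data the first feature separates the two classes far more cleanly than the second, so it receives the higher score. A ranking of this kind could then bias which features a GA keeps or flips, although how the paper does so is not described here.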

Funder

Support Program for Scientific and Technological Innovation Teams in Universities in Henan Province

Publisher

MDPI AG

Link

https://www.mdpi.com/2313-7673/9/6/360/pdf

Reference: 40 articles.

1. Bhushan, B. (2023, January 28–29). Optimal Feature Learning for Speech Emotion Recognition—A DeepNet Approach. Proceedings of the 2023 International Conference on Data Science and Network Security (ICDSNS), Tiptur, India.

2. A comprehensive review of speech emotion recognition systems;Wani;IEEE Access,2021

3. CREMA-D: Improving Accuracy with BPSO-Based Feature Selection for Emotion Recognition Using Speech;Donuk;J. Soft Comput. Artif. Intell.,2022

4. A survey of speech emotion recognition in natural environment;Fahad;Digit. Signal Process.,2021

5. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers;Speech Commun.,2020