English Speech Emotion Classification Based on Multi-Objective Differential Evolution

Published:2023-11-13 Issue:22 Volume:13 Page:12262
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Yue Liya¹, Hu Pei², Chu Shu-Chuan³, Pan Jeng-Shyang³,⁴

Affiliation:

1. Fanli Business School, Nanyang Institute of Technology, Nanyang 473004, China

2. School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, China

3. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China

4. Department of Information Management, Chaoyang University of Technology, Taichung 413310, Taiwan

Abstract

Speech signals carry both speakers’ emotional states and linguistic information, which makes them important for human–computer interaction systems that recognize speakers’ emotions. Feature selection is a common method for improving recognition accuracy. In this paper, we propose a multi-objective optimization method based on differential evolution (MODE-NSF) that maximizes recognition accuracy and minimizes the number of selected features (NSF). First, Mel-frequency cepstral coefficient (MFCC) features and pitch features are extracted from speech signals. Then, the proposed algorithm performs feature selection, with the NSF guiding the initialization, crossover, and mutation steps. We used four English speech emotion datasets with K-nearest neighbor (KNN) and random forest (RF) classifiers to validate the performance of the proposed algorithm. The results show that MODE-NSF is superior to other multi-objective algorithms in terms of hypervolume (HV), inverted generational distance (IGD), Pareto optimal solutions, and running time. MODE-NSF achieved an accuracy of 49% on eNTERFACE05, 53% on the Ryerson audio-visual database of emotional speech and song (RAVDESS), 76% on the Surrey audio-visual expressed emotion (SAVEE) database, and 98% on the Toronto emotional speech set (TESS). These recognition results provide a basis for building emotional models.
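
The two objectives named in the abstract (recognition accuracy and NSF) and the hypervolume indicator can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes accuracy is expressed as an error rate (1 - accuracy) so that both objectives are minimized, and the function names, sample values, and reference point are hypothetical.

```python
from typing import List, Tuple

# Assumption: a candidate feature subset is summarized as
# (error_rate, n_selected_features), and both values are minimized.
Point = Tuple[float, int]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by error rate."""
    front = [p for p in set(points) if not any(dominates(q, p) for q in points)]
    return sorted(front)


def hypervolume_2d(front: List[Point], ref: Tuple[float, float]) -> float:
    """Area dominated by a two-objective minimization front, bounded by ref."""
    area = 0.0
    prev_nsf = ref[1]
    # Error rate ascends in sorted order, so NSF descends along a true front.
    for err, nsf in sorted(front):
        if err < ref[0] and nsf < prev_nsf:
            area += (ref[0] - err) * (prev_nsf - nsf)
            prev_nsf = nsf
    return area


# Hypothetical candidates: (error rate, number of selected features).
candidates = [(0.2, 30), (0.3, 10), (0.5, 5), (0.4, 20), (0.3, 35)]
front = pareto_front(candidates)
hv = hypervolume_2d(front, ref=(1.0, 40))
```

Here `(0.4, 20)` and `(0.3, 35)` are dominated and drop out of the front. A larger hypervolume indicates a front that sits closer to low error and few features relative to the chosen reference point.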

Funder

Henan Provincial Philosophy and Social Science Planning Project

Henan Province Key Research and Development and Promotion Special Project

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/13/22/12262/pdf

Reference: 34 articles.

1. Hasija, T., Kadyan, V., Guleria, K., Alharbi, A., Alyami, H., and Goyal, N. (2022). Prosodic feature-based discriminatively trained low resource speech recognition system. Sustainability, 14.

2. Arslan, R.S., and Barışçı, N. (2019). Development of output correction methodology for long short term memory-based speech recognition. Sustainability, 11.

3. Zhao, Z.D., Zhao, M.S., Lu, H.L., Wang, S.H., and Lu, Y.Y. (2023). Digital Mapping of Soil pH Based on Machine Learning Combined with Feature Selection Methods in East China. Sustainability, 15.

4. Song (2016). Biomedical Named Entity Recognition Based on Feature Selection and Word Representations. J. Inf. Hiding Multim. Signal Process.

5. Yuan, S., Ji, Y., Chen, Y., Liu, X., and Zhang, W. (2023). An Improved Differential Evolution for Parameter Identification of Photovoltaic Models. Sustainability, 15.

Cited by 2 articles.

1. Multimodal ML Strategies for Wind Turbine Condition Monitoring in Heterogeneous IoT Data Environments;Lecture Notes in Networks and Systems;2024

2. Genetic Algorithm for High-Dimensional Emotion Recognition from Speech Signals;Electronics;2023-11-25