English Speech Emotion Classification Based on Multi-Objective Differential Evolution
-
Published:2023-11-13
Issue:22
Volume:13
Page:12262
-
ISSN:2076-3417
-
Container-title:Applied Sciences
-
language:en
-
Short-container-title:Applied Sciences
Author:
Yue Liya (1), Hu Pei (2), Chu Shu-Chuan (3), Pan Jeng-Shyang (3, 4)
Affiliation:
1. Fanli Business School, Nanyang Institute of Technology, Nanyang 473004, China
2. School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, China
3. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
4. Department of Information Management, Chaoyang University of Technology, Taichung 413310, Taiwan
Abstract
Speech signals carry both linguistic content and information about speakers’ emotional states, and recognizing those emotions is important for human–computer interaction. Feature selection is a common way to improve recognition accuracy. In this paper, we propose a multi-objective optimization method based on differential evolution (MODE-NSF) that maximizes recognition accuracy while minimizing the number of selected features (NSF). First, Mel-frequency cepstral coefficient (MFCC) features and pitch features are extracted from the speech signals. The proposed algorithm then performs feature selection, with the NSF guiding its initialization, crossover, and mutation. We used four English speech emotion datasets with K-nearest neighbor (KNN) and random forest (RF) classifiers to validate the algorithm. The results show that MODE-NSF outperforms other multi-objective algorithms in terms of hypervolume (HV), inverted generational distance (IGD), Pareto optimal solutions, and running time. MODE-NSF achieved an accuracy of 49% on eNTERFACE05, 53% on the Ryerson audio-visual database of emotional speech and song (RAVDESS), 76% on the Surrey audio-visual expressed emotion (SAVEE) database, and 98% on the Toronto emotional speech set (TESS). These recognition results provide a basis for building emotional models.
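The two objectives named in the abstract (recognition accuracy and NSF) can be illustrated with a minimal Python sketch. It is not the authors' MODE-NSF: the toy data, the 1-nearest-neighbor classifier, and the helper names (`nn_error`, `objectives`, `dominates`) are assumptions made for illustration, and exhaustive enumeration of feature masks stands in for the differential evolution search. Accuracy is expressed as an error rate so that both objectives are minimized.

```python
from itertools import product

# Toy data (hypothetical): each sample is (feature vector, emotion label).
# Feature 0 separates the two classes; features 1 and 2 are distractors.
TRAIN = [
    ([0.0, 5.0, 1.0], "calm"),
    ([0.1, 1.0, 4.0], "calm"),
    ([1.0, 4.8, 1.2], "angry"),
    ([0.9, 1.1, 3.9], "angry"),
]
TEST = [
    ([0.05, 3.0, 2.5], "calm"),
    ([0.95, 3.0, 2.5], "angry"),
]


def nn_error(train, test, mask):
    """Error rate of a 1-nearest-neighbor classifier restricted to masked features."""
    idx = [i for i, keep in enumerate(mask) if keep]
    wrong = 0
    for x, label in test:
        # Squared Euclidean distance over the selected features only.
        nearest = min(train, key=lambda s: sum((s[0][i] - x[i]) ** 2 for i in idx))
        wrong += nearest[1] != label
    return wrong / len(test)


def objectives(train, test, mask):
    """Return (error rate, number of selected features); both are minimized."""
    return (nn_error(train, test, mask), sum(mask))


def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


# Evaluate every non-empty feature subset (feasible only for a toy problem;
# an evolutionary search would sample this space instead).
evaluated = {
    mask: objectives(TRAIN, TEST, mask)
    for mask in product((0, 1), repeat=3)
    if any(mask)
}

# Keep the masks whose objective vectors no other mask dominates.
front = [
    mask
    for mask, f in evaluated.items()
    if not any(dominates(g, f) for g in evaluated.values())
]
```

On this toy data the only non-dominated mask is `(1, 0, 0)`: it reaches zero error with a single feature, so every larger subset is dominated. With real MFCC and pitch features the front would typically contain several trade-off points between error and NSF.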
Funder
Henan Provincial Philosophy and Social Science Planning Project
Henan Province Key Research and Development and Promotion Special Project
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Reference: 34 articles.
Cited by
2 articles.