Paralinguistic and spectral feature extraction for speech emotion classification using machine learning techniques

Published:2023-05-15 Issue:1 Volume:2023 Page:
ISSN:1687-4722
Container-title:EURASIP Journal on Audio, Speech, and Music Processing
language:en
Short-container-title:J AUDIO SPEECH MUSIC PROC.

Author:

Liu Tong, Yuan Xiaochen

Abstract

Emotion plays a dominant role in speech: the same utterance spoken with different emotions can carry a completely different meaning. The ability to express a variety of emotions while speaking is also a typical human characteristic. Accordingly, there is a trend toward developing advanced speech emotion classification algorithms to enhance the interaction between computers and human beings. This paper proposes a speech emotion classification approach based on paralinguistic and spectral feature extraction. Mel-frequency cepstral coefficients (MFCC) are extracted as the spectral feature, and openSMILE is employed to extract the paralinguistic feature. Two machine learning techniques, a multi-layer perceptron classifier and support vector machines, are each applied to the extracted features to classify the speech emotions. Experiments on the Berlin database evaluate the performance of the proposed approach. The experimental results show that the proposed approach achieves satisfactory performance. Comparisons are conducted under both clean and noisy conditions, and the results indicate better performance for the proposed scheme.
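The spectral feature named in the abstract can be illustrated with a minimal sketch of a textbook MFCC computation for a single frame, using only the Python standard library. This is not the authors' implementation: the Hamming window, the 20 mel filters, the 13 coefficients, the 16 kHz sample rate, and all function names are assumptions chosen for illustration. The openSMILE features and the MLP/SVM classifiers are not shown.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising slope
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return num_coeffs MFCCs for one frame (illustrative defaults)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the filter energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
                    for row in bank]
    # An unnormalised DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Example: one 256-sample frame of a 440 Hz tone at an assumed 16 kHz sample rate.
SAMPLE_RATE = 16000
frame = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(256)]
coeffs = mfcc_frame(frame, SAMPLE_RATE)
```

In a full pipeline, per-frame vectors like `coeffs` would typically be pooled over an utterance (for example by mean and standard deviation) before being passed to a classifier; how the paper aggregates them is not stated in the abstract.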

Funder

Research project of the Macao Polytechnic University

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering,Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-023-00290-x.pdf

References (39 articles)

1. X. Cao, M. Jia, J. Ru, T.w. Pai, Cross-corpus speech emotion recognition using subspace learning and domain adaption. EURASIP J. Audio Speech Music Process. 2022(1), 32 (2022)

2. K. Wang, N. An, B.N. Li, Y. Zhang, L. Li, Speech emotion recognition using Fourier parameters. IEEE Trans. Affect. Comput. 6(1), 69–75 (2015)

3. D. Tang, P. Kuppens, L. Geurts, T. van Waterschoot, End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network. EURASIP J. Audio Speech Music Process. 2021(1), 18 (2021)

4. L. Sun, S. Fu, F. Wang, Decision tree SVM model with Fisher feature selection for speech emotion recognition. EURASIP J. Audio Speech Music Process. 2019(1), 1–14 (2019)

5. P. Ekman, An argument for basic emotions. Cogn. Emot. 6(3–4), 169–200 (1992)

Cited by 2 articles.

1. CNN CLASSIFICATION OF SOYBEANS WITH STORAGE TIME BASED ON NEAR INFRARED SPECTROSCOPY;Engenharia Agrícola;2023-12

2. Design an Optimum Feature Selection Method to Improve the Accuracy of the Speech Recognition System;SN Computer Science;2023-08-29