Speech emotion recognition based on dynamic convolutional neural network-Reference-Cited by-同舟云学术

Speech emotion recognition based on dynamic convolutional neural network

Published:2023-03-08 Issue:1 Volume:10 Page:72-77
ISSN:2413-1660
Container-title:Journal of Computing and Electronic Information Management
language:
Short-container-title:JCEIM

Author:

Lin Ziyao,Hu Zhangfang,Zhu Kuilin

Abstract

In speech emotion recognition, the use of deep learning algorithms that extract and classify features of audio emotion samples usually requires the use of a large amount of resources, which makes the system more complex. This paper proposes a speech emotion recognition system based on dynamic convolutional neural network combined with bi-directional long and short-term memory network. On the one hand, the dynamic convolutional kernel allows the neural network to extract global dynamic emotion information, which can improve the performance while ensuring the computational power of the model, and on the other hand, the bi-directional long and short-term memory network enables the model to classify the emotion features more effectively with the temporal information. In this paper, we use CISIA Chinese speech emotion dataset, EMO-DB German emotion corpus and IEMOCAP English corpus to conduct experiments, and the average emotion recognition accuracy of the experimental results are 59.08%, 89.29% and 71.25%, which are 1.17%, 1.36% and 2.97% higher than the accuracy of speech emotion recognition systems using mainstream models, respectively. The effectiveness of the method in this paper is proved.

Publisher

Darcy & Roy Press Co. Ltd.

Reference28 articles.

1. Akçay M B, Oğuz K. Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers[J]. Speech Communication, 2020, 116: 56-76.

2. Abbaschian B J, Sierra-Sosa D, Elmaghraby A. Deep learning techniques for speech emotion recognition, from databases to models[J]. Sensors, 2021, 21(4): 1249.

3. Pandey S K, Shekhawat H S, Prasanna S R M. Deep learning techniques for speech emotion recognition: A review[C]//2019 29th International Conference Radioelektronika (RADIOELEKTRONIKA). IEEE, 2019: 1-6.

4. Issa D, Demirci M F, Yazici A. Speech emotion recognition with deep convolutional neural networks[J]. Biomedical Signal Processing and Control, 2020, 59: 101894.

5. Lee J, Tashev I. High-level feature representation using recurrent neural network for speech emotion recognition[C]//Interspeech 2015. 2015.

Cited by 1 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Virtual human speech emotion recognition based on multi-channel CNN: MFCC, LPC, and F0 features;Journal of Physics: Conference Series;2023-12-01