Optimized Multimodal Emotional Recognition Using Long Short-Term Memory

Published:2024-07-01 Issue:1 Volume:3 Page:9-17
ISSN:2583-7370
Container-title:Contemporaneity of English Language and Literature in the Robotized Millennium
language:en
Short-container-title:cellrm

Author:

Abstract

The aim of this project is to study and classify human emotions. The method introduced for recognizing speech signals is LSTM (Long Short-Term Memory), a type of recurrent neural network (RNN). RNNs are designed to analyze sequential data, which makes them well suited to speech signal recognition. Several publicly available datasets were identified for this project, for example TESS (Toronto Emotional Speech Set), RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), SAVEE (Surrey Audio-Visual Expressed Emotion), and CREMA-D (Crowd-sourced Emotional Multimodal Actors Dataset). The main dataset used in this project is TESS, and Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction.
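The pipeline the abstract describes (MFCC frames fed to an LSTM, followed by an emotion classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature size (40 coefficients), hidden size (32), random untrained weights, and the synthetic stand-in for an MFCC matrix are all assumptions chosen for illustration, and the function names are hypothetical. The seven output classes reflect the seven emotion categories labelled in TESS.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(frames, W, U, b):
    """Run a single LSTM layer over a (T, n_features) sequence.

    Returns the hidden state after the last frame.
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state
    for x in frames:
        # One affine map produces all four gate pre-activations at once.
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
        g = np.tanh(g)                                # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def classify(frames, params):
    """Map a sequence of feature frames to class probabilities via softmax."""
    h = lstm_forward(frames, params["W"], params["U"], params["b"])
    logits = params["V"] @ h + params["d"]
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

def init_params(n_features, hidden, n_classes, seed=0):
    """Random, untrained weights (assumption: illustration only)."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden, n_features)),
        "U": rng.normal(0.0, 0.1, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "V": rng.normal(0.0, 0.1, (n_classes, hidden)),
        "d": np.zeros(n_classes),
    }

# Synthetic stand-in for an MFCC matrix: 50 frames x 40 coefficients (assumed sizes).
mfcc_frames = np.random.default_rng(1).normal(size=(50, 40))
params = init_params(n_features=40, hidden=32, n_classes=7)
probs = classify(mfcc_frames, params)
print(probs.shape, float(probs.sum()))
```

In practice the MFCC matrix would be computed from each audio clip by a feature-extraction library, and the weights would be learned from labelled recordings rather than drawn at random.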

Publisher

REST Publisher
