Case study of features extraction and real time classification of emotion from speech on the basis with using neural nets

Published:2022-09-22 Issue:5 Volume:43 Page:5399-5415
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Magdin Martin¹, Sulka Timotej¹, Fodor Kristián¹

Affiliation:

1. Department of Informatics, Constantine the Philosopher University in Nitra, Faculty of Natural Science and Informatics, Trieda A. Hlinku 1, Nitra, Nitra, Slovakia

Abstract

The paper addresses the classification of emotional state from speech. Because it relied on the k-NN algorithm, the original solution achieved an overall classification success in the range of 20 to 35%, depending on the database of audio samples used as input. The original application used the Praat program to extract features. In the current version of the application, Praat has been eliminated and we have developed our own solution based on neural networks. Three experiments, with feedforward, 1D convolutional and 2D convolutional neural networks, were performed to determine the overall classification success. In all three, prediction success was highest in tests with a test subset of the RAVDESS database, with the best result obtained using the 1D convolutional network (78.93%). Tests with the EMO-DB database achieved 35.76%, 31.75% and 25.49%. The worst results in all three experiments were obtained in tests with the SAVEE database: 20.24%, 18.45% and 22.02%.

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability
