Affiliation:
1. Department of Informatics, Constantine the Philosopher University in Nitra, Faculty of Natural Science and Informatics, Trieda A. Hlinku 1, Nitra, Nitra, Slovakia
Abstract
This paper addresses the classification of emotional state from speech. The original solution, which applied the k-NN algorithm, achieved an overall classification success of 20 to 35%, depending on the database of audio samples used as input. The original application used the Praat program to extract features. In the current version of the application, the use of Praat has been eliminated and we have developed our own solution based on neural networks. We performed three experiments, with feedforward, 1D convolutional and 2D convolutional neural networks, to determine the overall classification success. Their common feature is that prediction success was always highest in tests with a test subset of the RAVDESS database, with the best result obtained using the 1D convolutional network (78.93%). Tests with the EMO-DB database achieved success rates of 35.76%, 31.75% and 25.49%. In all three experiments, the worst results were obtained in tests with the SAVEE database: 20.24%, 18.45% and 22.02%.
Subject
Artificial Intelligence, General Engineering, Statistics and Probability