Author:
Koshekov K. T., Kobenko V. Yu., Anayatova R. K., Savostin A. A., Koshekov A. K.
Abstract
This article addresses the problem of developing an effective method for automatically classifying the emotions of aviation personnel (announcers) by voice. To this end, a speaker-independent algorithm is proposed that performs multi-class classification of seven emotional states (joy, fear, anger, sadness, disgust, surprise and neutrality) on the basis of a set of 48 informative features. These features are formed from a digital recording of the speech signal by calculating mel-frequency cepstral coefficients (MFCC) and the fundamental frequency for individual frames of the recording. The informativeness of the MFCCs is increased, and their dimensionality reduced, by processing them with a deep convolutional neural network. The classifier model is realized by means of logistic regression, trained on emotionally coloured English speech samples represented by these informative features. On the test sample, the trained model achieves a correct recognition accuracy of 0.96. The proposed solution can be used to improve human-machine interfaces, as well as in aviation, medicine, marketing, etc.
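The final stage of the pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration only: it trains a multinomial logistic regression classifier mapping a 48-dimensional feature vector (standing in for the MFCC-derived features plus pitch values) to one of seven emotion classes. The data here is synthetic; the paper's actual features come from MFCCs processed by a deep convolutional network and from the fundamental frequency of real speech frames.

```python
import numpy as np

EMOTIONS = ["joy", "fear", "anger", "sadness", "disgust", "surprise", "neutral"]
n_classes, n_features, n_samples = len(EMOTIONS), 48, 700

# Synthetic data: one Gaussian cluster of feature vectors per emotion class.
rng = np.random.default_rng(0)
centers = rng.normal(0, 3, size=(n_classes, n_features))
y = rng.integers(0, n_classes, size=n_samples)
X = centers[y] + rng.normal(0, 1, size=(n_samples, n_features))

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train multinomial logistic regression by batch gradient descent
# on the cross-entropy loss.
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
Y = np.eye(n_classes)[y]  # one-hot targets
for _ in range(300):
    P = softmax(X @ W + b)
    W -= 0.5 * (X.T @ (P - Y)) / n_samples
    b -= 0.5 * (P - Y).mean(axis=0)

pred = (X @ W + b).argmax(axis=1)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

On this easily separable synthetic data the classifier fits almost perfectly; the 0.96 test accuracy reported in the abstract refers to the authors' real emotionally coloured speech corpus, not to this toy setup.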
Subject
General Physics and Astronomy