A Novel S-LDA Features for Automatic Emotion Recognition from Speech using 1-D CNN
Published: 2022-01-01
Issue: 1
Volume: 7
Pages: 49-67
ISSN: 2455-7749
Container-title: International Journal of Mathematical, Engineering and Management Sciences
Language: en
Short-container-title: Int J Math, Eng, Manag Sci
Author:
Tiwari Pradeep1, Darji A. D.2
Affiliation:
1. Department of Electronics Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India; Department of Electronics and Telecommunication Engineering, Mukesh Patel School of Technology Management and Engineering, NMIMS University, Mumbai, India. 2. Department of Electronics Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, India.
Abstract
Emotions are explicit and intense mental activities that find expression in speech, body gestures, facial features, etc. Speech is a fast, effective and convenient mode of human communication, and has therefore become the most researched modality in Automatic Emotion Recognition (AER). Extracting the most discriminative and robust features from speech for AER remains a challenge. This paper proposes a new algorithm, named Shifted Linear Discriminant Analysis (S-LDA), to extract modified features from static low-level features such as Mel-Frequency Cepstral Coefficients (MFCC) and pitch. A 1-D Convolutional Neural Network (CNN) is then applied to these modified features to extract high-level features for AER. The classification performance of the proposed techniques has been evaluated on three standard databases: the Berlin EMO-DB emotional speech database, the Surrey Audio-Visual Expressed Emotion (SAVEE) database and the eNTERFACE database. The proposed technique is shown to outperform state-of-the-art techniques, with best AER accuracies of 86.41% on the eNTERFACE database, 99.59% on the Berlin database and 99.57% on the SAVEE database.
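The abstract outlines a two-stage pipeline: frame-level low-level descriptors (MFCC and pitch) are transformed by the proposed S-LDA, and a 1-D CNN then learns high-level features for classification. The abstract does not specify the S-LDA algorithm itself, so the sketch below substitutes scikit-learn's ordinary LDA for the discriminant-projection step and uses an illustrative 1-D CNN; the function names, feature dimensions, filter sizes and the 7-class setup are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of a low-level-feature -> discriminant projection -> 1-D CNN
# pipeline, assuming librosa, scikit-learn and TensorFlow/Keras are installed.
# The plain LDA here is only a stand-in for the paper's S-LDA step.
import numpy as np
import librosa
import tensorflow as tf
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def low_level_features(path, sr=16000, n_mfcc=13):
    """Frame-level MFCC and pitch (F0) features for one utterance."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)           # (n_mfcc, T)
    f0 = librosa.yin(y, fmin=65.0, fmax=400.0, sr=sr,
                     frame_length=2048, hop_length=512)              # (T',)
    t = min(mfcc.shape[1], f0.shape[0])
    return np.vstack([mfcc[:, :t], f0[None, :t]]).T                  # (T, n_mfcc+1)

def discriminant_projection(frames, frame_labels, n_components=5):
    """Stand-in for S-LDA: project frame features onto LDA directions."""
    lda = LinearDiscriminantAnalysis(n_components=n_components)
    lda.fit(frames, frame_labels)
    return lda, lda.transform(frames)

def build_1d_cnn(n_frames, n_feats, n_classes):
    """Small 1-D CNN over the (frames x projected-features) sequence."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_frames, n_feats)),
        tf.keras.layers.Conv1D(64, 5, activation="relu", padding="same"),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.Conv1D(128, 5, activation="relu", padding="same"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

# Example configuration: 300 frames, 5 projected features, 7 EMO-DB classes.
model = build_1d_cnn(n_frames=300, n_feats=5, n_classes=7)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

In the paper's method, the S-LDA projection would replace the plain LDA step above, and the network architecture and hyper-parameters would follow the authors' reported configuration.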
Publisher
International Journal of Mathematical, Engineering and Management Sciences plus Mangey Ram
Subject
General Engineering, General Business, Management and Accounting, General Mathematics, General Computer Science
Cited by
3 articles.