Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network

Published:2022-02-27 Volume:2022 Page:1-9
ISSN:2040-2309
Container-title:Journal of Healthcare Engineering
language:en
Short-container-title:Journal of Healthcare Engineering

Author:

Puri Tanvi¹, Soni Mukesh², Dhiman Gaurav³⁴⁵, Ibrahim Khalaf Osamah⁶, Alazzam Malik⁷, Raza Khan Ihtiram⁸

Affiliation:

1. ICT Ganpat University, Ahmedabad, Gujarat, India

2. Computer Science and Engineering, Jagran Lakecity University, Bhopal, India

3. Department of Computer Science, Government Bikram College of Commerce, Patiala, India

4. University Centre for Research and Development, Department of Computer Science and Engineering, Chandigarh University, Gharuan, Mohali, India

5. Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun, India

6. Al-Nahrain University, Baghdad, Iraq

7. Lone Star College-Victory Center, Houston, TX, USA

8. Computer Science Department, Jamia Hamdard University, Delhi, India

Abstract

Every human being has emotions about the things that relate to them. A customer's emotion, for example, can help a customer representative understand that customer's requirements, so speech emotion recognition plays an important role in human interaction. An intelligent system can improve this recognition, and to that end we design a convolutional neural network (CNN) based model that classifies emotions into categories such as positive and negative, or into more specific classes. In this paper, we use the audio recordings of the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log Mel spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs) were used to extract features from the raw audio files. These features were used to classify emotions with techniques such as Long Short-Term Memory (LSTM), CNNs, Hidden Markov Models (HMMs), and Deep Neural Networks (DNNs). We divide the emotions in three ways, separately for male and female speakers. In the first, we divide the emotions into two classes, positive and negative. In the second, we divide them into three classes: positive, negative, and neutral. In the third, we divide them into eight classes: happy, sad, angry, fearful, surprised, disgust, calm, and neutral. For all three settings, we propose a model built from eight consecutive 2D convolutional layers. The proposed model performs better than previously reported models, so the emotion of a consumer can be identified more reliably.
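
The three labelling schemes described in the abstract (two, three, and eight classes, split by speaker gender) can be sketched from RAVDESS file names alone. This is a minimal illustration, not the authors' code: the emotion codes and the odd/even actor convention follow the public RAVDESS file naming scheme, while the grouping of emotions into positive and negative, the handling of neutral in the two-class scheme, and the helper names `parse_ravdess_name` and `label_for` are assumptions made here for the example.

```python
# Sketch only: the polarity grouping below is an assumption for illustration,
# not the mapping published in the paper.

# Emotion codes from the RAVDESS file naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# Hypothetical grouping of the eight emotions by polarity.
POSITIVE = {"calm", "happy", "surprised"}
NEGATIVE = {"sad", "angry", "fearful", "disgust"}


def parse_ravdess_name(filename):
    """Return (emotion, gender) for a name like '03-01-06-01-02-01-12.wav'."""
    stem = filename.rsplit("/", 1)[-1].split(".")[0]
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {filename!r}")
    emotion = RAVDESS_EMOTIONS[fields[2]]
    # Odd-numbered actors are male, even-numbered actors are female.
    gender = "male" if int(fields[6]) % 2 == 1 else "female"
    return emotion, gender


def label_for(emotion, n_classes):
    """Map an emotion name to its label under the 2-, 3-, or 8-class scheme."""
    if n_classes == 8:
        return emotion
    if n_classes not in (2, 3):
        raise ValueError("n_classes must be 2, 3, or 8")
    if emotion in POSITIVE:
        return "positive"
    if emotion in NEGATIVE:
        return "negative"
    # Neutral speech: its own class with 3 classes; dropped (None) with 2 classes.
    return "neutral" if n_classes == 3 else None


if __name__ == "__main__":
    emotion, gender = parse_ravdess_name("03-01-06-01-02-01-12.wav")
    print(emotion, gender, label_for(emotion, 3))
```

Returning `None` for neutral clips in the two-class scheme is one possible choice; the abstract does not say how such clips are treated.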

Publisher

Hindawi Limited

Subject

Health Informatics, Biomedical Engineering, Surgery, Biotechnology

Link

http://downloads.hindawi.com/journals/jhe/2022/8472947.pdf

Reference: 31 articles.

1. Comparison between k-nn and svm method for speech emotion recognition;M. Khan;International Journal on Computer Science and Engineering,2011

2. Acoustic feature selection for automatic emotion recognition from speech

3. Survey on speech emotion recognition: Features, classification schemes, and databases

4. Emotion recognition from speech: a review

5. Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011

Cited by 25 articles.

1. An enhanced speech emotion recognition using vision transformer;Scientific Reports;2024-06-07

2. Self-supervised Learning for Speech Emotion Recognition Task Using Audio-visual Features and Distil Hubert Model on BAVED and RAVDESS Databases;Journal of Systems Science and Systems Engineering;2024-05-29

3. Enhancing masked facial expression recognition with multimodal deep learning;Multimedia Tools and Applications;2024-02-13

4. Machine Learning Approach for Detection of Speech Emotions for RAVDESS Audio Dataset;2024 Fourth International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT);2024-01-11

5. Detecting and Analyzing the Emotional Levels of a Person Through CBT Using MFCC and Lexicon-Based Approach;Lecture Notes in Networks and Systems;2024