
Towards enhancing emotion recognition via multimodal framework

Published:2023-01-30 Issue:2 Volume:44 Page:2455-2470
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Akalya devi C.¹,Karthika Renuka D.¹,Pooventhiran G.²,Harish D.³,Yadav Shweta⁴,Thirunarayan Krishnaprasad⁴

Affiliation:

1. Department of Information Technology, PSG College of Technology, Coimbatore, India

2. Qualcomm India Private Limited, Chennai, India

3. Software AG, Bangalore, India

4. Department of Computer Science and Engineering, Wright State University, Dayton, OH, USA

Abstract

Emotional AI, which draws on cues from multiple sources, is expected to play a major role in fields such as entertainment, health care, and self-paced online education. In this work, we propose a multimodal emotion recognition system that extracts information from speech, motion capture (mocap), and text data. The main aim of this research is to improve the unimodal architectures so that they outperform the state of the art, and then to combine them into a robust multimodal fusion architecture. We developed 1D and 2D CNN-LSTM time-distributed models for speech, a hybrid CNN-LSTM model for motion capture data, and a BERT-based model for text data, and attempted both concatenation-based decision-level fusion and Deep CCA-based feature-level fusion schemes. The proposed speech and mocap models achieve emotion recognition accuracies of 65.08% and 67.51%, respectively, and the BERT-based text model achieves an accuracy of 72.60%. The decision-level fusion approach significantly improves emotion detection accuracy on the IEMOCAP and MELD datasets: it achieves 80.20% accuracy on IEMOCAP, which is 8.61% higher than state-of-the-art methods, and 63.52% and 61.65% in 5-class and 7-class classification on MELD, respectively, both higher than the state of the art.
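As a rough illustration of what concatenation-based decision-level fusion can look like, the sketch below concatenates the class-probability vectors produced by three unimodal models and passes them through a single linear layer with softmax. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the four-class setup, and the random (untrained) weights are all hypothetical, and the abstract does not specify the fusion classifier.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(modality_probs, weights, bias):
    """Concatenate per-modality class probabilities, then apply linear + softmax.

    modality_probs: list of probability vectors, one per modality.
    weights: matrix with one row per concatenated input, one column per class.
    bias: one value per class.
    """
    # Decision-level fusion: the inputs are model outputs, not raw features.
    fused_input = [p for probs in modality_probs for p in probs]
    logits = [
        sum(x * weights[i][c] for i, x in enumerate(fused_input)) + bias[c]
        for c in range(len(bias))
    ]
    return softmax(logits)


NUM_CLASSES = 4  # assumption for illustration; not taken from the abstract
random.seed(0)

# Stand-ins for the outputs of the speech, mocap, and text models.
speech_probs = softmax([random.gauss(0, 1) for _ in range(NUM_CLASSES)])
mocap_probs = softmax([random.gauss(0, 1) for _ in range(NUM_CLASSES)])
text_probs = softmax([random.gauss(0, 1) for _ in range(NUM_CLASSES)])

# Untrained placeholder parameters; in practice these would be learned.
weights = [
    [random.gauss(0, 0.1) for _ in range(NUM_CLASSES)]
    for _ in range(3 * NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES

fused_probs = fuse_decisions([speech_probs, mocap_probs, text_probs], weights, bias)
predicted_class = max(range(NUM_CLASSES), key=fused_probs.__getitem__)
```

Feature-level fusion, by contrast, would combine intermediate representations from each model rather than their final predictions.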

Publisher

IOS Press

Subject

Artificial Intelligence,General Engineering,Statistics and Probability


Cited by 1 article.

1. Context-Based Emotion Recognition: A Survey;2023