M3GAT: A Multi-modal, Multi-task Interactive Graph Attention Network for Conversational Sentiment Analysis and Emotion Recognition

Authors:

Zhang Yazhou (1), Jia Ao (2), Wang Bo (3), Zhang Peng (3), Zhao Dongming (4), Li Pu (1), Hou Yuexian (3), Jin Xiaojia (4), Song Dawei (2), Qin Jing (5)

Affiliations:

1. Software Engineering College, Zhengzhou University of Light Industry, China

2. School of Computer Science and Technology, Beijing Institute of Technology, China

3. College of Intelligence and Computing, Tianjin University, China

4. Artificial Intelligence Laboratory, China Mobile Communication Group Tianjin Co., Ltd., China

5. Centre for Smart Health, School of Nursing, Hong Kong Polytechnic University, China

Abstract

Sentiment and emotion correspond to long-term and short-lived human feelings, respectively, and are closely linked, making sentiment analysis and emotion recognition two interdependent tasks in natural language processing (NLP). Each task often leverages knowledge shared by the other and performs better when the two are solved in a joint learning paradigm. Conversational context dependency, multi-modal interaction, and multi-task correlation are three key factors that contribute to this joint paradigm, yet no recent approach has considered all three in a unified framework. To fill this gap, we propose a multi-modal, multi-task interactive graph attention network, termed M3GAT, that addresses the three problems simultaneously. At the heart of the model is an interactive conversation graph layer with three core sub-modules: (1) a local-global context connection that models both local and global conversational context, (2) a cross-modal connection that learns multi-modal complementarity, and (3) a cross-task connection that captures the correlation between the two tasks. Comprehensive experiments on three benchmark datasets, MELD, MEISD, and MSED, show that M3GAT outperforms state-of-the-art baselines by margins of 1.88%, 5.37%, and 0.19% for sentiment analysis, and 1.99%, 3.65%, and 0.13% for emotion recognition, respectively. We also show the superiority of multi-task learning over the single-task framework.
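The three connection types of the interactive conversation graph layer can be pictured as a single graph whose edges carry a type label, with attention computed over each node's typed neighborhood. The paper's actual layer uses learned, multi-head graph attention; the pure-Python sketch below only illustrates the idea, with a hypothetical scalar weight per edge type (`type_weight`) standing in for the learned attention parameters of each sub-module.

```python
import math

def gat_layer(nodes, edges, type_weight):
    """One simplified attention pass over a typed conversation graph.

    nodes       -- list of feature vectors, one per (utterance, modality, task) node
    edges       -- list of (src, dst, etype) triples; etype is one of
                   "context", "cross_modal", "cross_task"
    type_weight -- illustrative scalar weight per edge type (hypothetical;
                   stands in for learned attention parameters)
    """
    out = []
    for i, h_i in enumerate(nodes):
        incoming = [(s, t) for (s, d, t) in edges if d == i]
        if not incoming:
            out.append(h_i[:])  # isolated node: pass features through
            continue
        # Unnormalized score: dot(h_i, h_s), scaled by the edge-type weight.
        scores = [type_weight[t] * sum(a * b for a, b in zip(h_i, nodes[s]))
                  for s, t in incoming]
        # Softmax over the neighborhood (shifted by the max for stability).
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        z = sum(exps)
        alphas = [e / z for e in exps]
        # Aggregate neighbour features with the attention weights.
        agg = [sum(a * nodes[s][k] for a, (s, _) in zip(alphas, incoming))
               for k in range(len(h_i))]
        out.append(agg)
    return out
```

For example, a node receiving one "context" edge and one "cross_modal" edge of equal score ends up with the average of its two neighbours' features; raising `type_weight["cross_modal"]` would shift the aggregation toward the other modality's representation.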

Funder

The Hong Kong Polytechnic University

National Science Foundation of China

State Key Laboratory for Novel Software Technology, Nanjing University

Industrial Science and Technology Research Project of Henan Province

Foundation of Key Laboratory of Dependable Service Computing in Cyber-Physical-Society (Ministry of Education), Chongqing University

Natural Science Foundation of Henan

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

References (59 articles)

1. Md Shad Akhtar, Dushyant Singh Chauhan, Deepanway Ghosal, Soujanya Poria, Asif Ekbal, and Pushpak Bhattacharyya. 2019. Multi-task learning for multi-modal emotion recognition and sentiment analysis. arXiv preprint arXiv:1905.05812 (2019).

2. Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis

3. Ze-Jing Chuang and Chung-Hsien Wu. 2004. Multi-modal emotion recognition from speech and text. In International Journal of Computational Linguistics & Chinese Language Processing, Volume 9, Number 2, August 2004: Special Issue on New Trends of Speech and Language Processing. 45–62.

4. Elizabeth M. Daly and Mads Haahr. 2008. Social network analysis for information flow in disconnected delay-tolerant MANETs. IEEE Transactions on Mobile Computing 8, 5 (2008), 606–621.

5. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

Cited by 2 articles.

1. Self-Adaptive Representation Learning Model for Multi-Modal Sentiment and Sarcasm Joint Analysis. ACM Transactions on Multimedia Computing, Communications, and Applications (2024-01-11).

2. Moving From Narrative to Interactive Multi-Modal Sentiment Analysis: A Survey. ACM Transactions on Asian and Low-Resource Language Information Processing (2023-07-22).
