Multi-Output Learning Based on Multimodal GCN and Co-Attention for Image Aesthetics and Emotion Analysis

Published:2021-06-20 Issue:12 Volume:9 Page:1437
ISSN:2227-7390
Container-title:Mathematics
language:en
Short-container-title:Mathematics

Author:

Miao Haotian, Zhang Yifei, Wang Daling, Feng Shi

Abstract

With the development of social networks and intelligent terminals, sharing and acquiring images has become increasingly convenient. The massive growth in the number of social images has raised the demand for automatic image processing, especially from the aesthetic and emotional perspectives. Both aesthetics assessment and emotion recognition require the computer to simulate high-level visual perception and understanding, a problem that belongs to the field of image processing and pattern recognition. However, existing methods often ignore prior knowledge about images and the intrinsic relationships between the aesthetic and emotional perspectives. Recently, machine learning and deep learning have become powerful methods for solving mathematical problems in computing, such as image processing and pattern recognition. Both images and abstract concepts can be converted into numerical matrices, and the mapping relations between them can then be established mathematically on computers. In this work, we propose an end-to-end multi-output deep learning model based on a multimodal Graph Convolutional Network (GCN) and co-attention for the joint analysis of aesthetics and emotion. In our model, a stacked multimodal GCN encodes the features under the guidance of the correlation matrix, and a co-attention module helps the aesthetics and emotion feature representations learn from each other interactively. Experimental results indicate that the proposed model achieves competitive performance on the IAE dataset. Promising results on the AVA and ArtPhoto datasets also demonstrate the generalization ability of the model.
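As a rough illustration of the two mechanisms named in the abstract, the following NumPy sketch stacks two graph convolution steps guided by a correlation matrix and then lets two feature branches attend to each other. It is not the authors' implementation. The symmetric normalization, the scaled dot-product affinity, the layer sizes, the random inputs, and all function names (`normalize_adjacency`, `gcn_layer`, `co_attention`) are assumptions made for illustration only.

```python
import numpy as np


def normalize_adjacency(corr):
    """Add self-loops and apply symmetric normalization: D^-1/2 (A + I) D^-1/2.

    Assumes a non-negative correlation matrix, so every degree is positive.
    """
    a = corr + np.eye(corr.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, h, w):
    """One graph convolution step: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ h @ w, 0.0)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def co_attention(f_aes, f_emo):
    """Let each branch attend over the other's nodes via a shared affinity matrix."""
    affinity = f_aes @ f_emo.T / np.sqrt(f_aes.shape[1])
    aes_ctx = softmax(affinity, axis=1) @ f_emo    # aesthetics nodes read emotion features
    emo_ctx = softmax(affinity.T, axis=1) @ f_aes  # emotion nodes read aesthetics features
    return aes_ctx, emo_ctx


# Hypothetical sizes and random data, chosen only to make the sketch runnable.
rng = np.random.default_rng(0)
n_nodes, in_dim, hid_dim = 5, 8, 16

corr = rng.random((n_nodes, n_nodes))
corr = (corr + corr.T) / 2.0          # symmetric, non-negative stand-in correlation matrix
a_hat = normalize_adjacency(corr)

h0 = rng.normal(size=(n_nodes, in_dim))  # stand-in node features


def stacked_gcn(h):
    """Two stacked GCN layers with freshly drawn weights (one call per branch)."""
    w1 = rng.normal(size=(in_dim, hid_dim))
    w2 = rng.normal(size=(hid_dim, hid_dim))
    return gcn_layer(a_hat, gcn_layer(a_hat, h, w1), w2)


f_aes = stacked_gcn(h0)  # aesthetics branch
f_emo = stacked_gcn(h0)  # emotion branch
aes_ctx, emo_ctx = co_attention(f_aes, f_emo)
print(aes_ctx.shape, emo_ctx.shape)
```

In a trained model the weights would be learned and the correlation matrix would come from prior knowledge about the data; here both are random placeholders.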

Publisher

MDPI AG

Subject

General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/9/12/1437/pdf

Cited by 14 articles.

1. Multimodal Emotion Recognition with Deep Learning: Advancements, challenges, and future directions;Information Fusion;2024-05

2. Image Aesthetics Assessment With Emotion-Aware Multibranch Network;IEEE Transactions on Instrumentation and Measurement;2024

3. Quantifying image naturalness using transfer learning and fusion model;Multimedia Tools and Applications;2023-12-11

4. Research Progress on the Aesthetic Quality Assessment of Complex Layout Images Based on Deep Learning;Applied Sciences;2023-08-29

5. Adaptive sentiment analysis using multioutput classification: a performance comparison;PeerJ Computer Science;2023-05-09