Affiliation:
1. School of Information and Electronic Engineering, Zhejiang University of Science & Technology, Hangzhou 310023, China
2. Institute of Information and Communication Engineering, Zhejiang University, Hangzhou 310027, China
Abstract
In recent years, the prediction of salient regions in RGB-D images has become a focus of research. Saliency prediction for RGB-D images is more challenging than for RGB images alone. In this study, we propose a novel deep multimodal fusion autoencoder for the saliency prediction of RGB-D images. The core trainable autoencoder of the RGB-D saliency prediction model takes two raw modalities (RGB and depth/disparity information) as inputs and their corresponding eye-fixation attributes as labels. The autoencoder comprises four main networks: a color channel network, a disparity channel network, a feature concatenated network, and a feature learning network. It mines the complex relationship between color and disparity cues and makes full use of their complementary characteristics. Finally, the saliency map is predicted via a feature combination subnetwork, which combines the deep features extracted by the prior learning and convolutional feature learning subnetworks. We compare the proposed autoencoder with other saliency prediction models on two publicly available benchmark datasets. The results demonstrate that the proposed autoencoder outperforms these models by a significant margin.
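The two-stream structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes small fully connected layers with random, untrained weights in place of the convolutional networks, and it omits the prior learning and feature combination subnetworks. The class name `TwoStreamFusionSketch`, the layer sizes, and the 8x8 input resolution are hypothetical choices made only to show how color and disparity features might be extracted separately, concatenated, and mapped to a saliency map.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TwoStreamFusionSketch:
    """Hypothetical stand-in for the four networks named in the abstract."""

    def __init__(self, n_pixels, hidden=32):
        self.color = dense(3 * n_pixels, hidden)   # color channel network
        self.disparity = dense(n_pixels, hidden)   # disparity channel network
        self.learn = dense(2 * hidden, hidden)     # feature learning network
        self.out = dense(hidden, n_pixels)         # per-pixel saliency readout (assumed)

    def forward(self, rgb, disp):
        # rgb: (batch, H, W, 3); disp: (batch, H, W)
        batch = rgb.shape[0]
        f_color = relu(rgb.reshape(batch, -1) @ self.color[0] + self.color[1])
        f_disp = relu(disp.reshape(batch, -1) @ self.disparity[0] + self.disparity[1])
        # Feature concatenated network: join the two modality-specific features.
        fused = np.concatenate([f_color, f_disp], axis=1)
        hidden = relu(fused @ self.learn[0] + self.learn[1])
        saliency = sigmoid(hidden @ self.out[0] + self.out[1])
        return saliency.reshape(disp.shape)

H, W = 8, 8
model = TwoStreamFusionSketch(n_pixels=H * W)
rgb = rng.random((2, H, W, 3))    # synthetic RGB batch
disp = rng.random((2, H, W))      # synthetic disparity batch
saliency = model.forward(rgb, disp)
print(saliency.shape)
```

In a trained model the eye-fixation maps would serve as targets for `saliency`; here the output is only a shape-correct placeholder with values between 0 and 1.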
Funder
National Natural Science Foundation of China
Subject
General Mathematics, General Medicine, General Neuroscience, General Computer Science
Cited by
4 articles.
1. Using Convolutional Neural Networks for the Assessment Research of Mental Health;Computational Intelligence and Neuroscience;2022-05-09
2. Dynamic Invariant-Specific Representation Fusion Network for Multimodal Sentiment Analysis;Computational Intelligence and Neuroscience;2022-01-24
3. Deep Tensor Evidence Fusion Network for Sentiment Classification;IEEE Transactions on Computational Social Systems;2022
4. Robot Localization and Scene Modeling Based on RGB-D Sensor;The 2021 International Conference on Machine Learning and Big Data Analytics for IoT Security and Privacy;2021-10-28