Attention-guided Multi-modality Interaction Network for RGB-D Salient Object Detection
Published: 2023-10-23
Issue: 3
Volume: 20
Page: 1-22
ISSN: 1551-6857
Container-title: ACM Transactions on Multimedia Computing, Communications, and Applications
language: en
Short-container-title: ACM Trans. Multimedia Comput. Commun. Appl.
Author:
Wang Ruimin (1),
Wang Fasheng (2),
Su Yiming (2),
Sun Jing (2),
Sun Fuming (2),
Li Haojie (3)
Affiliation:
1. School of Information and Communication Engineering, Dalian Minzu University, China
2. Dalian Minzu University, China
3. Dalian University of Technology, China
Abstract
The past decade has witnessed great progress in RGB-D salient object detection (SOD). However, two bottlenecks limit its further development. The first is low-quality depth maps. Most existing methods use raw depth maps directly for detection, but low-quality depth images can degrade detection performance, so depth maps should not be used indiscriminately. The second is how to effectively predict saliency maps with clear boundaries and complete salient regions. To address these problems, an Attention-Guided Multi-Modality Interaction Network (AMINet) is proposed. First, we propose a new quality enhancement strategy for unreliable depth images, named the Depth Enhancement Module (DEM). For the second issue, we propose a Cross-Modality Attention Module (CMAM) to rapidly locate salient regions. The Boundary-Aware Module (BAM) is designed to use high-level features to guide low-level feature generation in a top-down way, compensating for the dilution of boundary information. To further improve accuracy, we propose an Atrous Refined Block (ARB) that adaptively compensates for the shortcomings of atrous convolution. By integrating these interactive modules, features from the depth and RGB streams can be refined efficiently, which in turn boosts detection performance. Experimental results demonstrate that the proposed AMINet exceeds state-of-the-art (SOTA) methods on several public RGB-D datasets.
Funder
National Natural Science Foundation of China
LiaoNing Revitalization Talents Program
Liaoning Baiqianwan Talents Program
Innovative Talents Program for Liaoning Universities
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture
References: 89 articles.