Attention-guided Multi-modality Interaction Network for RGB-D Salient Object Detection

Published:2023-10-23 Issue:3 Volume:20 Page:1-22
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Wang Ruimin¹, Wang Fasheng², Su Yiming², Sun Jing², Sun Fuming², Li Haojie³

Affiliation:

1. School of Information and Communication Engineering, Dalian Minzu University, China

2. Dalian Minzu University, China

3. Dalian University of Technology, China

Abstract

The past decade has witnessed great progress in RGB-D salient object detection (SOD). However, two bottlenecks limit its further development. The first is low-quality depth maps. Most existing methods use raw depth maps directly, but low-quality depth images can degrade detection performance, so it is undesirable to use depth maps indiscriminately. The second is how to effectively predict saliency maps with clear boundaries and complete salient regions. To address these problems, an Attention-Guided Multi-Modality Interaction Network (AMINet) is proposed. First, we propose a new quality enhancement strategy for unreliable depth images, named the Depth Enhancement Module (DEM). For the second issue, we propose a Cross-Modality Attention Module (CMAM) to rapidly locate salient regions. The Boundary-Aware Module (BAM) is designed to use high-level features to guide low-level feature generation in a top-down way, making up for the dilution of boundaries. To further improve accuracy, we propose an Atrous Refined Block (ARB) to adaptively compensate for the shortcoming of atrous convolution. By integrating these interactive modules, features from the depth and RGB streams can be refined efficiently, which in turn boosts detection performance. Experimental results demonstrate that the proposed AMINet outperforms state-of-the-art (SOTA) methods on several public RGB-D datasets.
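For readers unfamiliar with the atrous (dilated) convolution that the abstract mentions, the following is a minimal 1D sketch of the general operation only. It is not the ARB or any other part of AMINet, whose internals the abstract does not describe; the function name `atrous_conv1d` and the sample values are illustrative assumptions.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D atrous (dilated) cross-correlation.

    Hypothetical helper for illustration; not taken from the paper.
    Kernel taps are applied `dilation` samples apart, so the receptive
    field grows without adding kernel weights.
    """
    # Number of input samples covered by one output position.
    span = (len(kernel) - 1) * dilation + 1
    if span > len(signal):
        return []
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# dilation=1 is an ordinary convolution window of 3 samples.
dense = atrous_conv1d(signal, kernel, dilation=1)
# dilation=2 widens the window to 5 samples with the same 3 weights.
sparse = atrous_conv1d(signal, kernel, dilation=2)
```

With `dilation=2` each output reads input positions `i`, `i + 2`, and `i + 4`, so the samples in between are never touched by that output. The receptive field widens at no extra parameter cost, but the sampling becomes sparse.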

Funder

National Natural Science Foundation of China

LiaoNing Revitalization Talents Program

Liaoning Baiqianwan Talents Program

Innovative Talents Program for Liaoning Universities

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3624747

Reference: 89 articles.

1. Frequency-tuned salient region detection

2. Salient object detection: A survey

3. Salient Object Detection: A Benchmark

4. Dynamic message propagation network for RGB-D and video salient object detection; Chen Baian; ACM Trans. Multim. Comput., Commun. Applic., 2023

5. Depth-Quality-Aware Salient Object Detection