Lightweight Cross-Modal Information Mutual Reinforcement Network for RGB-T Salient Object Detection
Author:
Lv Chengtao 1, Wan Bin 1, Zhou Xiaofei 1, Sun Yaoqi 1,2, Zhang Jiyong 1, Yan Chenggang 1
Affiliation:
1. School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China
2. Lishui Institute, Hangzhou Dianzi University, Lishui 323000, China
Abstract
RGB-T salient object detection (SOD) has made significant progress in recent years. However, most existing methods rely on heavy models that are unsuitable for mobile devices, and there is still room for improvement in the design of cross-modal and cross-level feature fusion. To address these issues, we propose a lightweight cross-modal information mutual reinforcement network for RGB-T SOD. Our network consists of a lightweight encoder, a cross-modal information mutual reinforcement (CMIMR) module, and a semantic-information-guided fusion (SIGF) module. To reduce the computational cost and the number of parameters, we employ lightweight modules in both the encoder and the decoder. To fuse the complementary information between the two modalities, the CMIMR module refines the RGB and thermal features by absorbing previous-level semantic information and inter-modal complementary cues. To fuse cross-level features and detect multiscale salient objects, the SIGF module suppresses background noise in low-level features and extracts multiscale information. Extensive experiments on three RGB-T datasets show that our method achieves competitive performance compared with 15 state-of-the-art methods.
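The two fusion ideas in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy: the paper does not publish its equations here, so the gating form, the residual reinforcement, and the function names `cmimr` and `sigf` are all illustrative stand-ins for "refine each modality with the other modality plus previous-level semantics" and "use high-level semantics to gate low-level features", not the authors' exact design.

```python
import numpy as np

def sigmoid(x):
    """Elementwise logistic function, used here as a soft gate."""
    return 1.0 / (1.0 + np.exp(-x))

def cmimr(f_rgb, f_t, f_sem):
    """Toy cross-modal mutual reinforcement (hypothetical form):
    each modality is reinforced by a gate built from the *other*
    modality's features and the previous-level semantic features.
    All inputs share the same shape, e.g. (C, H, W)."""
    gate_rgb = sigmoid(f_t + f_sem)    # complementary cue for the RGB branch
    gate_t = sigmoid(f_rgb + f_sem)    # complementary cue for the thermal branch
    r_rgb = f_rgb + f_rgb * gate_rgb   # residual reinforcement keeps the original signal
    r_t = f_t + f_t * gate_t
    return r_rgb, r_t

def sigf(f_low, f_high):
    """Toy semantic-information-guided fusion (hypothetical form):
    a gate derived from the high-level feature suppresses background
    noise in the low-level feature before the two are combined."""
    gate = sigmoid(f_high)
    return f_low * gate + f_high
```

In this sketch the gates stay in (0, 1), so `cmimr` can only amplify (never erase) a modality's own response, while `sigf` lets the semantically stronger high-level map decide which low-level details survive; a real implementation would compute these gates with learned convolutions rather than raw feature sums.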
Funder
Zhejiang Province Key Research and Development Program of China; Zhejiang Province Nature Science Foundation of China; National Natural Science Foundation of China; "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province; 111 Project; Fundamental Research Funds for the Provincial Universities of Zhejiang
References: 73 articles.