BCMFIFuse: A Bilateral Cross-Modal Feature Interaction-Based Network for Infrared and Visible Image Fusion-Reference-Cited by-同舟云学术

BCMFIFuse: A Bilateral Cross-Modal Feature Interaction-Based Network for Infrared and Visible Image Fusion

Published:2024-08-25 Issue:17 Volume:16 Page:3136
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing

Author:

Gao Xueyan¹^ORCID,Liu Shiguang¹^ORCID

Affiliation:

1. College of Intelligence and Computing, Tianjin University, Tianjin 300350, China

Abstract

The main purpose of infrared and visible image fusion is to produce a fusion image that incorporates less redundant information while incorporating more complementary information, thereby facilitating subsequent high-level visual tasks. However, obtaining complementary information from different modalities of images is a challenge. Existing fusion methods often consider only relevance and neglect the complementarity of different modalities’ features, leading to the loss of some cross-modal complementary information. To enhance complementary information, it is believed that more comprehensive cross-modal interactions should be provided. Therefore, a fusion network for infrared and visible fusion is proposed, which is based on bilateral cross-feature interaction, termed BCMFIFuse. To obtain features in images of different modalities, we devise a two-stream network. During the feature extraction, a cross-modal feature correction block (CMFC) is introduced, which calibrates the current modality features by leveraging feature correlations from different modalities in both spatial and channel dimensions. Then, a feature fusion block (FFB) is employed to effectively integrate cross-modal information. The FFB aims to explore and integrate the most discriminative features from the infrared and visible image, enabling long-range contextual interactions to enhance global cross-modal features. In addition, to extract more comprehensive multi-scale features, we develop a hybrid pyramid dilated convolution block (HPDCB). Comprehensive experiments on different datasets reveal that our method performs excellently in qualitative, quantitative, and object detection evaluations.

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2072-4292/16/17/3136/pdf

Reference62 articles.

1. An interactively reinforced paradigm for joint infrared-visible image fusion and saliency object detection;Wang;Inf. Fusion,2023

2. AT-GAN: A generative adversarial network with attention and transition for infrared and visible image fusion;Rao;Inf. Fusion,2023

3. Wei, Q., Liu, Y., Jiang, X., Zhang, B., Su, Q., and Yu, M. (2024). DDFNet-A: Attention-Based Dual-Branch Feature Decomposition Fusion Network for Infrared and Visible Image Fusion. Remote Sens., 16.

4. A Dual-Domain Super-Resolution Image Fusion Method with SIRV and GALCA Model for PolSAR and Panchromatic Images;Liu;IEEE Trans. Geosci. Remote Sens.,2021

5. Liu, J., Fan, X., Huang, Z., Wu, G., Liu, R., Zhong, W., and Luo, Z. (2022, January 18–24). Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.