Affiliation:
1. School of Rail Transportation, Shandong Jiaotong University, Jinan 250357, China
2. State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044, China
3. CRSC Research and Design Institute Group Co., Ltd., Beijing 100070, China
Abstract
Multi-modal image fusion is a methodology that combines image features from multiple types of sensors, effectively improving the quality and content of fused images. However, most existing deep learning fusion methods extract only global or only local features, which restricts the representation of feature information. To address this issue, a hybrid densely connected CNN and transformer (HDCCT) fusion framework is proposed. In the proposed HDCCT framework, the CNN-based blocks capture the local structure of the input data and the transformer-based blocks capture its global structure, significantly improving the feature representation. An encoder–decoder architecture is designed for both the CNN and transformer blocks to reduce feature loss during fusion while preserving the characterization of features at all levels. In addition, the cross-coupled framework facilitates the flow of feature structures, retains the uniqueness of information, and enables the transformer to model long-range dependencies on top of the local features already extracted by the CNN. Meanwhile, to retain the information of the source images, a hybrid loss combining structural similarity (SSIM) and mean squared error (MSE) is introduced. Qualitative and quantitative comparisons on grayscale infrared and visible image fusion indicate that the proposed method outperforms related works.
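The abstract describes each hybrid stage as a densely connected convolutional block for local structure followed by a transformer that models long-range dependencies over those CNN features. The following is a minimal PyTorch sketch of that idea, not the paper's actual architecture: the layer sizes, the two-layer dense block, and the use of `nn.TransformerEncoderLayer` with one token per spatial position are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class DenseConvBlock(nn.Module):
    """Two densely connected 3x3 conv layers: each layer sees all earlier outputs."""
    def __init__(self, channels: int = 32, growth: int = 16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, growth, 3, padding=1)
        self.conv2 = nn.Conv2d(channels + growth, growth, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y1 = self.act(self.conv1(x))
        y2 = self.act(self.conv2(torch.cat([x, y1], dim=1)))
        return torch.cat([x, y1, y2], dim=1)  # dense connectivity: concat all features


class HybridStage(nn.Module):
    """CNN block for local features, then a transformer layer for global context."""
    def __init__(self, channels: int = 32, growth: int = 16, heads: int = 4):
        super().__init__()
        self.local = DenseConvBlock(channels, growth)
        dim = channels + 2 * growth  # channel count after dense concatenation
        self.globl = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.local(x)                      # (B, C', H, W) local structure
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, H*W, C'), one token per pixel
        tokens = self.globl(tokens)            # self-attention over all positions
        return tokens.transpose(1, 2).reshape(b, c, h, w)
```

Stacking such stages inside an encoder–decoder, with cross-coupled connections between the CNN and transformer paths, would correspond to the structure the abstract outlines.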
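The hybrid SSIM + MSE loss can likewise be sketched concretely. Below is a minimal PyTorch version, assuming the fused image is penalized against both source images; the weighting factor `lambda_ssim` and the 11x11 Gaussian window are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn.functional as F


def gaussian_window(size: int = 11, sigma: float = 1.5) -> torch.Tensor:
    """Normalized 2-D Gaussian kernel used to compute local SSIM statistics."""
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = g / g.sum()
    return (g[:, None] * g[None, :]).unsqueeze(0).unsqueeze(0)  # (1, 1, size, size)


def ssim(x: torch.Tensor, y: torch.Tensor, window: torch.Tensor,
         c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """Mean structural similarity between single-channel images scaled to [0, 1]."""
    pad = window.shape[-1] // 2
    mu_x = F.conv2d(x, window, padding=pad)
    mu_y = F.conv2d(y, window, padding=pad)
    var_x = F.conv2d(x * x, window, padding=pad) - mu_x ** 2
    var_y = F.conv2d(y * y, window, padding=pad) - mu_y ** 2
    cov = F.conv2d(x * y, window, padding=pad) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return ssim_map.mean()


def hybrid_fusion_loss(fused: torch.Tensor, ir: torch.Tensor,
                       vis: torch.Tensor, lambda_ssim: float = 0.5) -> torch.Tensor:
    """MSE plus (1 - SSIM) of the fused image against both source images."""
    window = gaussian_window().to(fused.device)
    mse_term = F.mse_loss(fused, ir) + F.mse_loss(fused, vis)
    ssim_term = (1 - ssim(fused, ir, window)) + (1 - ssim(fused, vis, window))
    return mse_term + lambda_ssim * ssim_term
```

The MSE term pulls the fused image toward the pixel intensities of the sources, while the SSIM term preserves local structure; balancing the two is what allows the fused result to retain information from both modalities.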