Affiliation:
1. School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, China
2. Machine Vision and Unmanned Systems Laboratory, Zhongshan Institute of Changchun University of Science and Technology, Zhongshan, China
3. School of Artificial Intelligence, Changchun University of Science and Technology, Changchun, China
Abstract
In recent years, research on infrared and visible image fusion has mainly focused on deep learning‐based approaches, particularly deep neural networks with auto‐encoder architectures. However, these approaches suffer from insufficient feature extraction capability and inefficient fusion strategies. This paper therefore introduces a novel image fusion network that addresses these limitations. In the designed network, the encoder employs a multi‐branch cascade structure whose convolution branches, with different kernel sizes, give the encoder an adaptive receptive field for extracting multi‐scale features. In addition, the fusion layer incorporates a non‐local attention module inspired by the self‐attention mechanism. With its global receptive field, this module is used to build a non‐local attention fusion network, which works together with the ‐norm spatial fusion strategy to extract, split, filter, and fuse global and local features. Comparative experiments on the TNO and MSRS datasets demonstrate that the proposed method outperforms other state‐of‐the‐art fusion approaches.
Publisher
Institution of Engineering and Technology (IET)