IRSTFormer: A Hierarchical Vision Transformer for Infrared Small Target Detection

Author:

Chen Gao, Wang Weihua, Tan Sirui

Abstract

Infrared small target detection plays an important role in infrared search and track systems. The most common infrared image size has grown to 640×512, and the field of view (FOV) has increased significantly. As a result, the image contains more interference that hinders the detection of small targets. Traditional model-driven methods lack the capability of feature learning, which results in poor adaptability to varied scenes. Owing to the locality of convolution kernels, recent convolutional neural networks (CNNs) cannot model long-range dependencies in the image to suppress false alarms. In this paper, we propose a hierarchical vision transformer-based method for infrared small target detection in larger 640×512 images with a wider FOV. Specifically, we design a hierarchical overlapped small patch transformer (HOSPT), instead of a CNN, to encode multi-scale features from a single-frame image. For the decoder, a top-down feature aggregation module (TFAM) is adopted to fuse features from adjacent scales. Furthermore, after analyzing existing loss functions, we adopt a simple yet effective combination of them to improve network convergence. In comparisons with other state-of-the-art methods, the normalized intersection-over-union (nIoU) reaches 0.856 on our IRST640 dataset and 0.758 on the public SIRST dataset. Detailed ablation experiments are conducted to validate the effectiveness and rationality of each component of the method.
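The abstract reports results in terms of nIoU but does not spell out how it is computed. As a rough illustration only, the sketch below assumes the definition commonly used in the infrared small target literature: the IoU is computed per image on binary masks and then averaged over the dataset. The function name `niou`, the handling of empty masks, and the toy masks are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def niou(preds, targets):
    """Average the per-image IoU over a collection of binary masks.

    Assumed definition (not stated in the abstract): each image
    contributes its own IoU, and the mean over images is returned.
    """
    ious = []
    for pred, target in zip(preds, targets):
        pred = np.asarray(pred, dtype=bool)
        target = np.asarray(target, dtype=bool)
        inter = np.logical_and(pred, target).sum()
        union = np.logical_or(pred, target).sum()
        # Assumed convention: an empty prediction on an empty target
        # counts as a perfect match instead of a division by zero.
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))


# Toy example with two 4x4 images (hypothetical data).
target_a = np.zeros((4, 4), dtype=bool)
target_a[1, 1:3] = True            # two target pixels
pred_a = target_a.copy()
pred_a[2, 2] = True                # one extra false-alarm pixel

target_b = np.zeros((4, 4), dtype=bool)
target_b[0, 0] = True              # one target pixel
pred_b = target_b.copy()           # perfect prediction

score = niou([pred_a, pred_b], [target_a, target_b])
print(round(score, 4))
```

Because every image contributes equally, a single false-alarm pixel on a tiny target lowers the score noticeably, which is one reason a per-image average suits small-target evaluation better than a pixel-pooled IoU.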

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences


Cited by 23 articles.
