Abstract
Infrared target detection is widely used in industrial fields, such as environmental monitoring, automatic driving, etc., and the detection of weak targets is one of the most challenging research topics in this field. Due to the small size of these targets, limited information and less surrounding contextual information, it increases the difficulty of target detection and recognition. To address these issues, this paper proposes YOLO-ISTD, an improved method for infrared small target detection based on the YOLOv5-S framework. Firstly, we propose a feature extraction module called SACSP, which incorporates the Shuffle Attention mechanism and makes certain adjustments to the CSP structure, enhancing the feature extraction capability and improving the performance of the detector. Secondly, we introduce a feature fusion module called NL-SPPF. By introducing an NL-Block, the network is able to capture richer long-range features, better capturing the correlation between background information and targets, thereby enhancing the detection capability for small targets. Lastly, we propose a modified K-means clustering algorithm based on Distance-IoU (DIoU), called K-means_DIOU, to improve the accuracy of clustering and generate anchors suitable for the task. Additionally, modifications are made to the detection heads in YOLOv5-S. The original 8, 16, and 32 times downsampling detection heads are replaced with 4, 8, and 16 times downsampling detection heads, capturing more informative coarse-grained features. This enables better understanding of the overall characteristics and structure of the targets, resulting in improved representation and localization of small targets. Experimental results demonstrate significant achievements of YOLO-ISTD on the NUST-SIRST dataset, with an improvement of 8.568% in mAP@0.5 and 8.618% in mAP@0.95. Compared to the comparative models, the proposed approach effectively addresses issues of missed detections and false alarms in the detection results, leading to substantial improvements in precision, recall, and model convergence speed.
Publisher
Public Library of Science (PLoS)