Abstract
Object detection in infrared images has attracted increasing attention in recent years. However, there are few studies on multi-scale object detection in infrared street scene images, and the lack of high-quality infrared datasets hinders research into such algorithms. To address these issues, we first make a series of modifications to the Faster Region-based Convolutional Neural Network (Faster R-CNN): a double-layer region proposal network (RPN) is proposed to predict proposals of different scales on both fine and coarse feature maps. Second, a multi-scale pooling module is introduced into the backbone of the network to explore the response of objects at different scales. Furthermore, the inception4 module and the position-sensitive region of interest (ROI) align (PSalign) pooling layer are used to extract richer features of the objects. Third, this paper proposes instance-level data augmentation, which accounts for the imbalance between categories while enlarging the dataset. In the training stage, online hard example mining is used to further improve the robustness of the algorithm in complex environments. The experimental results show that, compared with the baseline, our detection method achieves state-of-the-art performance.
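As an illustration of the online hard example mining step mentioned in the abstract, the following is a minimal sketch of the general idea only: from the per-ROI losses of a forward pass, keep the highest-loss ROIs and average the loss over just those. The function names, the plain top-k selection rule, and the `batch_size` parameter are assumptions made for this example; the abstract does not state the authors' exact selection procedure or settings.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], batch_size: int) -> List[int]:
    """Return indices of the `batch_size` highest-loss RoIs, hardest first.

    Hypothetical helper: a plain top-k rule, not the paper's exact procedure.
    """
    if batch_size <= 0:
        return []
    # Rank RoI indices by descending loss; ties keep their original order.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:batch_size]


def ohem_loss(losses: Sequence[float], batch_size: int) -> float:
    """Average the loss over the selected hard RoIs only."""
    keep = select_hard_examples(losses, batch_size)
    if not keep:
        return 0.0
    return sum(losses[i] for i in keep) / len(keep)


if __name__ == "__main__":
    roi_losses = [0.1, 2.0, 0.5, 1.5]  # made-up per-RoI losses for illustration
    print(select_hard_examples(roi_losses, 2))
    print(ohem_loss(roi_losses, 2))
```

In a real detector the selected indices would be used to mask the backward pass so that only the hard ROIs contribute gradients; easy examples, which dominate cluttered street scenes, are ignored for that iteration.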
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
14 articles.