Affiliation:
1. School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
2. Beijing Key Laboratory for Precision Optoelectronic Measurement Instrument and Technology, Beijing Institute of Technology, Beijing 100081, China
Abstract
Drone object detection faces numerous challenges such as dense clusters with overlapping, scale diversity, and long-tail distributions. Utilizing tiling inference through uniform sliding window is an effective way of enlarging tiny objects and meanwhile efficient for real-world applications. However, merely partitioning input images may result in heavy truncation and an unexpected performance drop in large objects. Therefore, in this work, we strive to develop an improved tiling detection framework with both competitive performance and high efficiency. First, we formulate the tiling inference and training pipeline with a mixed data strategy. To avoid truncation and handle objects at all scales, we simultaneously perform global detection on the original image and local detection on corresponding sub-patches, employing appropriate patch settings. Correspondingly, the training data includes both original images and the patches generated by random online anchor-cropping, which can ensure the effectiveness of patches and enrich the image scenarios. Furthermore, a scale filtering mechanism is applied to assign objects at diverse scales to global and local detection tasks to keep the scale invariance of a detector and obtain optimal fused predictions. As most of the additional operations are performed in parallel, the tiling inference remains highly efficient. Additionally, we devise two augmentations customized for tiling detection to effectively increase valid annotations, which can generate more challenging drone scenarios and simulate the practical cluster with overlapping, especially for rare categories. Comprehensive experiments on both public drone benchmarks and our customized real-world images demonstrate that, in comparison to other drone detection frameworks, the proposed tiling framework can significantly improve the performance of general detectors in drone scenarios with lower additional computational costs.
Funder
National Natural Science Foundation of China General Program
National Natural Science Foundation of China Key Program
Subject
General Earth and Planetary Sciences
Reference64 articles.
1. Half a Percent of Labels is Enough: Efficient Animal Detection in UAV Imagery Using Deep CNNs and Active Learning;Kellenberger;IEEE Trans. Geosci. Remote Sens.,2019
2. Earthquake Crack Detection From Aerial Images Using a Deformable Convolutional Neural Network;Yu;IEEE Trans. Geosci. Remote Sens.,2022
3. Yao, H., Qin, R., and Chen, X. (2019). Unmanned Aerial Vehicle for Remote Sensing Applications—A Review. Remote Sens., 11.
4. Faster R-CNN: Towards real-time object detection with region proposal networks;Ren;IEEE Trans. Pattern Anal. Mach. Intell.,2016
5. Tian, Z., Shen, C., Chen, H., and He, T. (2019–2, January 27). Fcos: Fully convolutional one-stage object detection. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea.
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献