DroneNet: Rescue Drone-View Object Detection
Authors:
Wang Xiandong1, Yao Fengqin1, Li Ankun2, Xu Zhiwei3, Ding Laihui3, Yang Xiaogang3, Zhong Guoqiang1, Wang Shengke1
Affiliations:
1. Faculty of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
2. Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250014, China
3. Shandong Willand Intelligent Technology Co., Ltd., Qingdao 266100, China
Abstract
Recently, research on drone-view object detection (DOD) has predominantly centered on efficiently identifying objects by cropping high-resolution images. However, it has overlooked the distinctive challenges posed by scale imbalance and the higher prevalence of small objects in drone images. In this paper, to address these challenges of DOD, we introduce a specialized detector called DroneNet. First, we propose a feature information enhancement module (FIEM) that effectively preserves object information and can be seamlessly integrated as a plug-and-play module into the backbone network. Second, we propose a split-concat feature pyramid network (SCFPN) that not only fuses feature information from different scales but also enables a more comprehensive exploration of the feature layers that contain many small objects. Finally, we develop a coarse-to-refine label assignment (CRLA) strategy for small objects, which assigns labels from coarse- to fine-grained levels and ensures that small objects receive adequate supervision during training. In addition, to further promote the development of DOD, we introduce a new dataset named OUC-UAV-DET. Extensive experiments on VisDrone2021, UAVDT, and OUC-UAV-DET demonstrate that our proposed detector, DroneNet, exhibits significant improvements in handling challenging targets, outperforming state-of-the-art detectors.
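The coarse-to-refine idea in the abstract can be sketched as a two-stage IoU-based assignment: a loose coarse pass gathers candidate anchors per ground-truth box, and a stricter refine pass keeps only high-IoU matches, falling back to the best coarse candidate when a small object would otherwise get no positive anchor. This is a minimal illustrative sketch, not the paper's actual CRLA algorithm; the function names, thresholds, and fallback rule here are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def coarse_to_refine_assign(anchors, gts, coarse_thr=0.3, refine_thr=0.5):
    """Hypothetical two-stage label assignment.

    Coarse pass: collect anchors with IoU >= coarse_thr for each GT.
    Refine pass: keep those with IoU >= refine_thr; if none survive
    (common for small objects), promote the single best coarse candidate
    so every GT contributes positive samples during training.
    Returns a dict mapping anchor index -> matched GT index.
    """
    labels = {}
    for g, gt in enumerate(gts):
        cand = [(iou(a, gt), i) for i, a in enumerate(anchors)]
        cand = [(s, i) for s, i in cand if s >= coarse_thr]      # coarse pass
        if not cand:
            continue
        refined = [(s, i) for s, i in cand if s >= refine_thr]   # refine pass
        for _, i in (refined or [max(cand)]):                    # fallback
            labels[i] = g
    return labels
```

For example, with one small ground-truth box and no anchor above the refine threshold, the best coarse candidate is still assigned, so the small object is not ignored by the loss.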
Subject
Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering
Cited by: 4 articles.