Authors:
Yu Yinhui, Sun Xu, Cheng Qing
Abstract
Despite the remarkable progress of general object detection, the scarcity of labeled aerial images limits the robustness and generalization of detectors. Teacher–student learning is a feasible solution in the natural image domain, but few works address unlabeled aerial images. Inspired by the strong generalization of foundation models in computer vision, we propose ET-FSM, an expert teacher framework based on a foundation image segmentation model. Our approach improves the student detector by generating high-quality pseudo-labels for unlabeled aerial images. In ET-FSM, we design a binary detector with an expert guidance mechanism that leverages the extra knowledge obtained from the foundation image segmentation model to accurately locate objects against complex backgrounds. We also present a momentum contrast classification module to distinguish easily confused object categories in aerial images. To demonstrate the effectiveness of the proposed method, we construct an unlabeled aerial image dataset covering various scenes. Experiments are conducted on diverse types of student detectors. The results show that the proposed approach outperforms existing methods and allows the student detector to reach fully supervised performance with far fewer labeled aerial images. Our dataset and code are available at https://github.com/cq100/ET-FSM.
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC