Co-ECL: Covariant Network with Equivariant Contrastive Learning for Oriented Object Detection in Remote Sensing Images
Published: 2024-01-29
Issue: 3
Volume: 16
Page: 516
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Zhang Yunsheng (1, 2), Ren Zijing (1), Ding Zichen (3), Qian Hong (4), Li Haiqiang (4), Tao Chao (1)
Affiliation:
1. School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
2. Xiangjiang Laboratory, Changsha 410205, China
3. Intellectual Property Protection Center of Inner Mongolia Autonomous Region, Hohhot 010000, China
4. Inner Mongolia Tongdao Yao Digital Technology Co., Ltd., Hohhot 010000, China
Abstract
Contrastive learning learns general features for downstream tasks without labeled data by leveraging intrinsic signals within remote sensing images. Existing contrastive learning methods encourage invariant feature learning by pulling positive samples, defined by random transformations, closer together in feature space, so that transformed samples of the same image at different intensities are treated as equivalent. However, remote sensing images differ from natural images in that their top-down perspective results in arbitrarily oriented objects and in that they contain rich in-plane rotation information. Maintaining invariance to rotation transformations can discard rotation information from the features, which degrades angle prediction for differently rotated samples in downstream tasks. Therefore, we argue that contrastive learning should not focus only on strict invariance but should encourage features to be equivariant to rotation while remaining invariant to other transformations. To achieve this goal, we propose an invariant–equivariant covariant network (Co-ECL) based on collaborative and reverse mechanisms. The collaborative mechanism encourages rotation equivariance by predicting the rotation transformations of input images, and combines the invariant and equivariant learning tasks to jointly supervise feature learning. The reverse mechanism introduces a reverse rotation module in the feature learning stage: in the invariant learning task, it applies to the features a reverse rotation of the same intensity as the rotation applied in the data transformation stage, thereby ensuring that the invariant and equivariant tasks are realized independently. In experiments on three publicly available oriented object detection datasets of remote sensing images, our method consistently achieved the best performance. Additionally, experiments on multi-angle datasets demonstrated that our method is robust on rotation-related tasks.
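The two mechanisms described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a toy model under stated assumptions: rotations are restricted to multiples of 90 degrees, `toy_features` is a hypothetical stand-in for the encoder (a symmetric neighbour sum, chosen because it is exactly equivariant to quarter turns), mean squared error stands in for the contrastive (invariant) loss, and the `weight` parameter balancing the two tasks is invented for illustration.

```python
import math

def rot90(grid, k):
    """Rotate a 2-D list counter-clockwise by k * 90 degrees (k may be negative)."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        grid = [list(row) for row in zip(*grid)][::-1]
    return [list(row) for row in grid]

def toy_features(img):
    """Hypothetical encoder: pixel plus half the sum of its 4 neighbours (zero padded).
    The kernel is symmetric, so the map is equivariant to 90-degree rotations."""
    h, w = len(img), len(img[0])
    def at(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0.0
    return [[img[r][c] + 0.5 * (at(r - 1, c) + at(r + 1, c) + at(r, c - 1) + at(r, c + 1))
             for c in range(w)] for r in range(h)]

def mse(a, b):
    """Mean squared error between two equally shaped 2-D lists (stand-in invariant loss)."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def combined_loss(img, k, rot_logits, weight=1.0):
    """Collaborative objective: invariant term plus weighted rotation-prediction term."""
    view = rot90(img, k)               # data transformation stage: rotate the input
    feats = toy_features(view)         # feature learning stage
    aligned = rot90(feats, -k)         # reverse rotation of equal intensity on the features
    invariant = mse(aligned, toy_features(img))
    equivariant = cross_entropy(rot_logits, k % 4)  # predict which rotation was applied
    return invariant + weight * equivariant, invariant, equivariant

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
total, inv, eq = combined_loss(img, 1, [0.0, 2.0, 0.0, 0.0])
unaligned = mse(toy_features(rot90(img, 1)), toy_features(img))
print(inv, unaligned > inv)
```

In this toy setting the invariant term vanishes once the features are rotated back, while comparing the unaligned features gives a non-zero error. That is the sense in which the reverse rotation lets the invariant task ignore rotation and leaves rotation information to the prediction task.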
Funder
the Major Program Project of Xiangjiang Laboratory; the Natural Science Foundation of Hunan for Distinguished Young Scholars
Cited by
1 article.