Weakly Supervised Semantic Segmentation in Aerial Imagery via Cross-Image Semantic Mining
Published: 2023-02-10
Volume: 15
Issue: 4
Page: 986
ISSN: 2072-4292
Container-title: Remote Sensing
Short-container-title: Remote Sensing
Language: en
Author:
Zhou Ruixue (1,2,3), Yuan Zhiqiang (1,2,3), Rong Xuee (1,2,3), Ma Weicong (4), Sun Xian (1,3), Fu Kun (1,3), Zhang Wenkai (1,3)
Affiliation:
1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China
2. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China
3. Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China
4. Xi’an Institute of Applied Optics, Xi’an 710065, China
Abstract
Weakly Supervised Semantic Segmentation (WSSS) with only image-level labels reduces the annotation burden and has developed rapidly in recent years. However, current mainstream methods employ only a single image’s information to localize targets and do not account for relationships across images. In Remote Sensing (RS) images, which are characterized by complex backgrounds and multiple categories, it is challenging to locate targets and differentiate between their categories. In contrast to previous methods, which mostly focused on single-image information, we propose CISM, a novel cross-image semantic mining WSSS framework. CISM explores cross-image semantics in multi-category RS scenes for the first time, with two novel loss functions: the Common Semantic Mining (CSM) loss and the Non-common Semantic Contrastive (NSC) loss. In particular, prototype vectors and the Prototype Interactive Enhancement (PIE) module are employed to capture semantic similarities and differences across images. To overcome category confusion and interference from closely related backgrounds, we integrate the Single-Label Secondary Classification (SLSC) task and the corresponding single-label loss into our framework. Furthermore, a Multi-Category Sample Generation (MCSG) strategy is devised to balance the distribution of samples among categories and increase image diversity. These designs facilitate the generation of more accurate, finer-granularity Class Activation Maps (CAMs) for each target category. Extensive experiments show that our approach outperforms previous methods on RS data: it is the first WSSS framework to explore cross-image semantics in multi-category RS scenes, and it achieves state-of-the-art results on the iSAID dataset using only image-level labels.
Experiments on the PASCAL VOC2012 dataset also demonstrate the effectiveness and competitiveness of the algorithm, pushing the mean Intersection-Over-Union (mIoU) to 67.3% and 68.5% on the validation and test sets, respectively.
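The paper’s CISM framework and its losses are not reproduced here, but the baseline CAM computation that WSSS methods of this kind refine can be sketched as follows. This is a minimal NumPy illustration of the standard CAM formulation (a weighted sum of the last convolutional feature maps by the classifier weights), not the authors’ code; all function and variable names are hypothetical.

```python
import numpy as np

def class_activation_map(features, weights, class_idx):
    """Compute a Class Activation Map (CAM) for one class.

    features: (C, H, W) activations from the last convolutional layer.
    weights:  (num_classes, C) weights of the final linear classifier.
    Returns an (H, W) map normalized to [0, 1].
    """
    c, h, w = features.shape
    # Weighted sum of channel maps using the class's classifier weights.
    cam = weights[class_idx] @ features.reshape(c, h * w)
    cam = cam.reshape(h, w)
    # ReLU, then min-max normalize so maps are comparable across images.
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Toy example: 4 channels, an 8x8 spatial grid, 3 classes.
rng = np.random.default_rng(0)
feats = rng.random((4, 8, 8)).astype(np.float32)
w = rng.random((3, 4)).astype(np.float32)
cam = class_activation_map(feats, w, class_idx=1)
print(cam.shape)  # (8, 8)
```

Thresholding such per-class maps yields the pseudo-masks used to train a segmentation network; the cross-image losses described in the abstract aim to make these maps more accurate before that step.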
Funder
National Science Fund for Distinguished Young Scholars; General Program of the National Natural Science Foundation of China
Subject
General Earth and Planetary Sciences
References: 64 articles.
Cited by: 2 articles.