Author:
Liu Baodi,Liu Yan,Shao Qianqian,Liu Weifeng
Abstract
AbstractIn recent decades, the development of multimedia and computer vision has sparked significant interest among researchers in the field of automatic image annotation. However, much of the research has primarily focused on using a single graph for annotating images in semi-supervised learning. Conversely, numerous approaches have explored the integration of multi-view or image segmentation techniques to create multiple graph structures. Yet, relying solely on a single graph proves to be challenging, as it struggles to capture the complete manifold of structural information. Furthermore, the computational complexity of building multiple graph structures based on multi-view or image segmentation is substantial and time-consuming. To address these issues, we propose a novel method called "Central Attention with Multi-graphs for Image Annotation." Our approach emphasizes the critical role of the central image region in the annotation process. Remarkably, we demonstrate that impressive performance can be achieved by leveraging just two graph structures, composed of central and overall features, in semi-supervised learning. To validate the effectiveness of our proposed method, we conducted a series of experiments on benchmark datasets, including Corel5K, ESPGame, and IAPRTC12. These experiments provide empirical evidence of our method’s capabilities.
Publisher
Springer Science and Business Media LLC
Reference54 articles.
1. Bakliwal P and Jawahar CV (2015) Active learning based image annotation. In: 2015 Fifth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), pp 1–4. IEEE
2. Belkin M, Niyogi P, and Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434
3. Bhagat PK, Choudhary P (2018) Image annotation: then and now. Image Vision Comput 80:1–23
4. Chen C, Li S, Wang Y, Qin H, Hao A (2017) Video saliency detection via spatial-temporal fusion and low-rank coherency diffusion. IEEE Trans Image Process 26(7):3156–3170
5. Chen C, Wang G, Peng C, Fang Y, Zhang D, Qin H (2021) Exploring rich and efficient spatial temporal interactions for real-time video salient object detection. IEEE Trans Image Process 30:3995–4007