Inferring Attention Shifts for Salient Instance Ranking

Published: 2023-10-18
ISSN: 0920-5691
Container-title: International Journal of Computer Vision
Language: en
Short-container-title: Int J Comput Vis

Author:

Siris Avishek, Jiao Jianbo, Tam Gary K. L., Xie Xianghua, Lau Rynson W. H.

Abstract

The human visual system has limited capacity in simultaneously processing multiple visual inputs. Consequently, humans rely on shifting their attention from one location to another. When viewing an image of complex scenes, psychology studies and behavioural observations show that humans prioritise and sequentially shift attention among multiple visual stimuli. In this paper, we propose to predict the saliency rank of multiple objects by inferring human attention shift. We first construct a new large-scale salient object ranking dataset, with the saliency rank of objects defined by the order that an observer attends to these objects via attention shift. We then propose a new deep learning-based model to leverage both bottom-up and top-down attention mechanisms for saliency rank prediction. Our model includes three novel modules: Spatial Mask Module (SMM), Selective Attention Module (SAM) and Salient Instance Edge Module (SIEM). SMM integrates bottom-up and semantic object properties to enhance contextual object features, from which SAM learns the dependencies between object features and image features for saliency reasoning. SIEM is designed to improve segmentation of salient objects, which helps further improve their rank predictions. Experimental results show that our proposed network achieves state-of-the-art performances on the salient object ranking task across multiple datasets. Code and data are available at https://github.com/SirisAvishek/Attention_Shift_Ranks.

Funder

Swansea Science DTC Postgraduate Research Scholarship

Engineering and Physical Sciences Research Council

Royal Society

Research Grants Council of Hong Kong

Strategic Research Grant from City University of Hong Kong

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://link.springer.com/content/pdf/10.1007/s11263-023-01906-7.pdf

References: 105 articles.

Cited by 1 article.

1. Salient Object Ranking: Saliency Model on Relativity Learning and Evaluation Metric on Triple Accuracy (2024)