Abstract
AbstractAdvances in virtual and augmented reality have increased the demand for immersive and engaging 3D experiences. To create such experiences, it is crucial to understand visual attention in 3D environments, which is typically modeled by means of saliency maps. While attention in 2D images and traditional media has been widely studied, there is still much to explore in 3D settings. In this work, we propose a deep learning-based model for predicting saliency when viewing 3D objects, which is a first step toward understanding and predicting attention in 3D environments. Previous approaches rely solely on low-level geometric cues or unnatural conditions, however, our model is trained on a dataset of real viewing data that we have manually captured, which indeed reflects actual human viewing behavior. Our approach outperforms existing state-of-the-art methods and closely approximates the ground-truth data. Our results demonstrate the effectiveness of our approach in predicting attention in 3D objects, which can pave the way for creating more immersive and engaging 3D experiences.
Publisher
Springer Science and Business Media LLC
Reference43 articles.
1. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998)
2. Koch, Christof, Ullman, Shimon: Shifts in selective visual attention: towards the underlying neural circuitry. Hum. Neurobiol. 4(4), 219–27 (1985)
3. Sun, Wanjie, Chen, Zhenzhong, Feng, Wu.: Visual scanpath prediction using IOR–ROI recurrent mixture density network. IEEE Trans. Pattern Anal. Mach. Intell. 43(6), 2101–2118 (2019)
4. Chen, X., Jiang, M., Zhao, Q.: Predicting human scanpaths in visual question answering. In: Proceedings of Computer Vision and Pattern Recognition (CVPR), pp. 10876–10885 (2021)
5. Martin, D., Gutierrez, D., Masia, B.: A probabilistic time-evolving approach to scanpath prediction (2022). arXiv preprint arXiv:2204.09404