Author:
Lian Dongze, Zhang Ziheng, Luo Weixin, Hu Lina, Wu Minye, Li Zechao, Yu Jingyi, Gao Shenghua
Abstract
This paper tackles RGBD-based gaze estimation with Convolutional Neural Networks (CNNs). Specifically, we propose to decompose gaze point estimation into eyeball pose, head pose, and 3D eye position estimation. Compared with RGB image-based gaze tracking, the depth modality facilitates head pose estimation and 3D eye position estimation. The captured depth image, however, usually contains noise and black holes, which noticeably hamper gaze tracking. We therefore propose a CNN-based multi-task learning framework that simultaneously refines depth images and predicts gaze points. We use a Generative Adversarial Network (GAN) for depth image generation, where the generator network is partially shared by the gaze tracking network and the GAN-based depth synthesis. By optimizing the whole network simultaneously, depth image synthesis improves gaze point estimation and vice versa. Since the only existing RGBD dataset (EYEDIAP) is too small, we build a large-scale RGBD gaze tracking dataset for performance evaluation. As far as we know, it is the largest RGBD gaze dataset in terms of the number of participants. Comprehensive experiments demonstrate that our method outperforms existing methods by a large margin on both our dataset and the EYEDIAP dataset.
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
20 articles.
1. Cascaded learning with transformer for simultaneous eye landmark, eye state and gaze estimation;Pattern Recognition;2024-12
2. Real-time eye tracking using representation learning and regression;Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD);2024-01-04
3. Automatic Gaze Analysis: A Survey of Deep Learning Based Approaches;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-01
4. Transfer the global knowledge for current gaze estimation;Multimedia Tools and Applications;2023-11-08
5. Be Real in Scale: Swing for True Scale in Dual Camera Mode;2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR);2023-10-16