Affiliation:
1. State Key Laboratory of Precision Measuring Technology and Instruments, Laboratory of Micro/Nano Manufacturing Technology (MNMT), Tianjin University , Tianjin 300072 , China
2. Centre of Micro/Nano Manufacturing Technology (MNMT-Dublin), University College Dublin , Dublin 4 , Ireland
Abstract
Abstract
Gaze estimation is a fundamental task in many applications of cognitive sciences, human–computer interaction, and robotics. The purely data-driven appearance-based gaze estimation methods may suffer from a lack of interpretability, which prevents their applicability to pervasive scenarios. In this study, a feature fusion method with multi-level information elements is proposed to improve the comprehensive performance of the appearance-based gaze estimation model. The multi-level feature extraction and expression are carried out from the originally captured images, and a multi-level information element matrix is established. A gaze conduction principle is formulated for reasonably fusing information elements from the established matrix. According to the gaze conduction principle along with the matrix, a multi-level information element fusion (MIEF) model for gaze estimation is proposed. Then, several input modes and network structures of the MIEF model are designed, and a series of grouping experiments are carried out on a small-scale sub-dataset. Furthermore, the optimized input modes and network structures of the MIEF model are selected for training and testing on the whole dataset to verify and compare model performance. Experimental results show that optimizing the feature combination in the input control module and fine-tuning the computational architecture in the feature extraction module can improve the performance of the gaze estimation model, which would enable the reduction of the model by incorporating the critical features and thus improve the performance and accessibility of the method. Compared with the reference baseline, the optimized model based on the proposed feature fusion method of multi-level information elements can achieve efficient training and improve the test accuracy in the verification experiment. The average error is 1.63 cm on phones on the GazeCapture dataset, which achieves comparable accuracy with state-of-the-art methods.
Funder
National Natural Science Foundation of China
Tianjin University
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics,Computer Graphics and Computer-Aided Design,Human-Computer Interaction,Engineering (miscellaneous),Modeling and Simulation,Computational Mechanics
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献