Affiliation:
1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing, China
2. Institute for Brain and Cognitive Sciences, College of Robotics, Beijing Union University, Beijing, China
Abstract
The limited texture detail in low‐resolution facial or eye images presents a challenge for gaze estimation. To address this, FSKT‐GE (feature‐map similarity knowledge transfer for low‐resolution gaze estimation) is proposed, a gaze estimation framework consisting of a high‐resolution (HR) network and a low‐resolution (LR) network with identical structure. Rather than relying on mere feature imitation, the problem is addressed by assessing the cosine similarity of feature layers, emphasizing the distribution similarity between the HR and LR networks and enabling the LR network to acquire richer knowledge. The framework uses a combined loss function incorporating a cosine similarity measurement, a soft loss based on the probability distribution difference and the gaze direction output, and a hard loss from the LR network's output layer. The approach is validated on low‐resolution datasets derived from the Gaze360 and RT‐Gene datasets, demonstrating excellent performance in low‐resolution gaze estimation. Evaluations are conducted on low‐resolution images obtained through 2×, 4×, and 8× down‐sampling on both datasets. On the Gaze360 dataset, the lowest mean angular errors of 10.97°, 11.22°, and 13.61° were achieved, while on the RT‐Gene dataset, the lowest mean angular errors of 6.73°, 6.83°, and 7.75° were obtained.
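The combined loss described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the paper's exact formulation: the layer pairing, the KL‐based probability term, the L1 terms, and the weights alpha, beta, and gamma are assumptions introduced here for illustration only.

```python
# Minimal sketch (assumed, not the authors' implementation) of a combined loss
# for HR -> LR feature-similarity knowledge transfer:
#   - feature loss: 1 - cosine similarity between matched HR/LR feature maps
#   - soft loss: teacher-student difference (probability distribution + gaze output)
#   - hard loss: LR gaze output vs. ground truth
import torch
import torch.nn.functional as F


def combined_loss(hr_feats, lr_feats, hr_logits, lr_logits, hr_gaze, lr_gaze,
                  gt_gaze, alpha=1.0, beta=1.0, gamma=1.0, temperature=4.0):
    # Feature-similarity term: align each LR feature map with the
    # corresponding HR feature map via cosine similarity.
    feat_loss = 0.0
    for f_hr, f_lr in zip(hr_feats, lr_feats):
        f_hr = f_hr.flatten(start_dim=1)
        f_lr = f_lr.flatten(start_dim=1)
        feat_loss += (1.0 - F.cosine_similarity(f_hr, f_lr, dim=1)).mean()
    feat_loss /= len(hr_feats)

    # Soft term: KL divergence between softened HR (teacher) and LR (student)
    # distributions, plus an L1 term on the gaze direction predicted by the
    # HR network (detached so only the LR network is updated).
    soft_loss = F.kl_div(
        F.log_softmax(lr_logits / temperature, dim=1),
        F.softmax(hr_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    soft_loss = soft_loss + F.l1_loss(lr_gaze, hr_gaze.detach())

    # Hard term: error between the LR network's gaze prediction and the
    # ground-truth gaze direction.
    hard_loss = F.l1_loss(lr_gaze, gt_gaze)

    return alpha * feat_loss + beta * soft_loss + gamma * hard_loss
```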
Funder
Natural Science Foundation of Beijing Municipality
National Natural Science Foundation of China
Publisher
Institution of Engineering and Technology (IET)