Advancements in Gaze Coordinate Prediction Using Deep Learning: A Novel Ensemble Loss Approach
Published: 2024-06-20
Applied Sciences, Volume 14, Issue 12, Page 5334
ISSN: 2076-3417
Language: en
Author:
Kim Seunghyun 1, Lee Seungkeon 1, Lee Eui Chul 2
Affiliation:
1. Department of AI & Informatics, Graduate School, Sangmyung University, Seoul 03016, Republic of Korea
2. Department of Human-Centered Artificial Intelligence, Sangmyung University, Seoul 03016, Republic of Korea
Abstract
Recent advancements in deep learning have enabled gaze estimation from images of the face and eye regions without the need for precise geometric locations of the eyes and face. This approach eliminates the need for complex user-dependent calibration and avoids the issues associated with extracting and tracking geometric positions; however, it has made further exploration of gaze position performance enhancements challenging. Motivated by this, our study focuses on an ensemble loss function that can enhance the performance of existing 2D-based deep learning models for gaze coordinate (x, y) prediction. We propose a new loss function and demonstrate its effectiveness by applying it to models from prior studies. The results show significant performance improvements across all cases. When applied to the ResNet and iTracker models, the average absolute error was reduced significantly, from 7.5 cm to 1.2 cm and from 7.67 cm to 1.3 cm, respectively. Notably, when implemented on AFF-Net, which boasts state-of-the-art performance, the average absolute error on the MPIIFaceGaze dataset was reduced from 4.21 cm to 0.81 cm. Additionally, predictions for ranges never encountered during the training phase showed a very low mean absolute error (MAE) of 0.77 cm without any personalization process. These findings suggest significant potential for accuracy improvements while maintaining computational complexity similar to that of existing models, without the need to create additional or more complex models.
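The abstract describes an ensemble loss for 2D gaze coordinate (x, y) regression but does not specify its components. As a rough, hypothetical sketch only (the paper's actual formulation is not given here), such a loss might combine several standard regression terms, e.g. MSE, MAE, and mean Euclidean distance, with scalar weights:

```python
import numpy as np

def ensemble_gaze_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Illustrative ensemble loss for 2D gaze coordinate prediction.

    Combines three common regression terms (MSE, MAE, and mean
    Euclidean distance) with scalar weights. NOTE: the components and
    weighting here are assumptions for illustration, not the paper's
    published loss.
    """
    pred = np.asarray(pred, dtype=float)      # shape (N, 2): predicted (x, y), e.g. in cm
    target = np.asarray(target, dtype=float)  # shape (N, 2): ground-truth (x, y)
    err = pred - target
    mse = np.mean(err ** 2)                     # mean squared error over both coordinates
    mae = np.mean(np.abs(err))                  # mean absolute error over both coordinates
    euc = np.mean(np.linalg.norm(err, axis=1))  # mean per-sample Euclidean distance
    w1, w2, w3 = weights
    return w1 * mse + w2 * mae + w3 * euc
```

A perfect prediction yields a loss of zero, and the Euclidean term penalizes joint (x, y) deviation rather than each axis independently, which matches how gaze error is reported (in cm) in the abstract.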
Funder
Sangmyung University