Affiliation:
1. Research Division Geoinformation, Vienna University of Technology, Wiedner Hauptstraße 8/E120, 1040 Vienna, Austria
Abstract
In mobile eye-tracking research, the automatic annotation of fixation points is an important yet difficult task, especially in varied and dynamic environments such as outdoor urban landscapes. The difficulty is compounded by the constant movement of both the observer and the surrounding scene. This paper presents a novel approach that integrates two foundation models, YOLOv8 and Mask2Former, into a pipeline that automatically annotates fixation points without requiring additional training or fine-tuning. Our pipeline leverages YOLO’s extensive training on the MS COCO dataset for object detection and Mask2Former’s training on the Cityscapes dataset for semantic segmentation. This integration not only streamlines the annotation process but also improves accuracy and consistency, ensuring reliable annotations even in complex scenes with multiple objects side by side or at different depths. Validation through two experiments demonstrates its effectiveness, achieving 89.05% accuracy in a controlled data-collection setting and 81.50% accuracy in a real-world outdoor wayfinding scenario. With an average runtime of 1.61 ± 0.35 s per frame, our approach stands as a robust solution for automatic fixation annotation.
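To make the pipeline idea concrete, the sketch below shows one way such a two-model annotator could be wired up with off-the-shelf libraries (Ultralytics YOLOv8 and Hugging Face Mask2Former). It is not the authors' released code: the checkpoint names, the "detection first, segmentation as fallback" priority rule, and the smallest-box tie-break for overlapping objects are illustrative assumptions.

```python
# Minimal sketch (not the authors' released code): annotate a single fixation
# point by querying YOLOv8 detections first and falling back to Mask2Former
# semantic segmentation when the point hits no detected object.
import torch
from PIL import Image
from ultralytics import YOLO
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# Checkpoint choices are assumptions; the paper only states MS COCO and
# Cityscapes pretraining for the two models.
detector = YOLO("yolov8x.pt")  # MS COCO-pretrained object detector
processor = AutoImageProcessor.from_pretrained(
    "facebook/mask2former-swin-large-cityscapes-semantic")
segmenter = Mask2FormerForUniversalSegmentation.from_pretrained(
    "facebook/mask2former-swin-large-cityscapes-semantic")


def annotate_fixation(frame: Image.Image, fx: int, fy: int) -> str:
    """Return a label for the fixation point (fx, fy) in a scene-camera frame."""
    # 1) Object detection: collect every COCO box containing the fixation and
    #    keep the smallest one, so nearer/foreground objects win on overlap.
    result = detector(frame, verbose=False)[0]
    hits = []
    for box, cls in zip(result.boxes.xyxy.tolist(), result.boxes.cls.tolist()):
        x1, y1, x2, y2 = box
        if x1 <= fx <= x2 and y1 <= fy <= y2:
            hits.append(((x2 - x1) * (y2 - y1), detector.names[int(cls)]))
    if hits:
        return min(hits)[1]

    # 2) Fallback: Cityscapes semantic-segmentation label at the fixation pixel,
    #    covering "stuff" classes (road, building, vegetation, ...) that the
    #    detector does not report.
    inputs = processor(images=frame, return_tensors="pt")
    with torch.no_grad():
        outputs = segmenter(**inputs)
    seg = processor.post_process_semantic_segmentation(
        outputs, target_sizes=[frame.size[::-1]])[0]
    return segmenter.config.id2label[int(seg[fy, fx])]
```

In this sketch, a fixation landing on a detected pedestrian would be labelled with the COCO class, while a fixation on the pavement, which no detector box covers, would fall through to the Cityscapes label at that pixel; the actual fusion rule used in the paper may differ.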