Author:
Deane Oliver, Toth Eszter, Yeo Sang-Hoon
Abstract
With continued advancements in portable eye-tracker technology liberating experimenters from the constraints of artificial laboratory designs, researchers can now collect gaze data during real-world, natural navigation. However, the field lacks a robust method for doing so: past approaches relied on time-consuming manual annotation of eye-tracking data, while previous attempts at automation lack the versatility needed for in-the-wild navigation trials involving complex and dynamic scenes. Here, we propose a system capable of informing researchers of where, and on what, a user’s gaze is focused at any given time. The system first runs footage recorded on a head-mounted camera through a deep-learning-based object detection algorithm, the Mask Region-based Convolutional Neural Network (Mask R-CNN). The algorithm’s output is combined with frame-by-frame gaze coordinates measured by an eye-tracking device synchronized with the head-mounted camera to detect and annotate, without any manual intervention, what a user looked at in each frame of the provided footage. The effectiveness of the presented methodology was validated by comparing the system’s output with that of manual coders. High agreement between the two established the system as the preferable data-collection technique, as it processed data at a significantly faster rate than its human counterparts. The system’s practicality was further demonstrated through a case study exploring the mediatory effects of gaze behaviors on an environment-driven attentional bias.
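The core of the pipeline described above is a per-frame lookup: does the synchronized gaze coordinate fall inside any instance mask produced by Mask R-CNN? The Python sketch below illustrates that matching step. It is not the authors’ released code; the detection format and all identifiers (`Detection`, `label_gaze`, `annotate_footage`) are hypothetical, and the sketch assumes the detector’s per-frame masks, class names, and confidence scores have already been extracted from a Mask R-CNN implementation.

```python
# Hypothetical sketch of the gaze-to-object matching step described in the
# abstract -- NOT the authors' released code. It assumes a Mask R-CNN
# implementation has already produced, for each scene-camera frame, a set of
# instance masks with class names and confidence scores, and that gaze
# coordinates have been synchronized to the frame timeline by the eye tracker.
from dataclasses import dataclass
from typing import List, Optional, Tuple

import numpy as np


@dataclass
class Detection:
    class_name: str    # detected object category, e.g. "person" or "bench"
    mask: np.ndarray   # boolean array of shape (H, W); True inside the object
    score: float       # detector confidence in [0, 1]


def label_gaze(gaze_xy: Tuple[float, float],
               detections: List[Detection]) -> Optional[str]:
    """Return the class name of the object under the gaze point, if any."""
    x, y = int(round(gaze_xy[0])), int(round(gaze_xy[1]))
    hits = []
    for det in detections:
        h, w = det.mask.shape
        # Gaze samples can fall slightly outside the frame; guard the lookup.
        if 0 <= y < h and 0 <= x < w and det.mask[y, x]:
            hits.append(det)
    if not hits:
        return None  # gaze landed on unsegmented background
    # If overlapping masks all contain the gaze point, keep the detection
    # the network is most confident about.
    return max(hits, key=lambda d: d.score).class_name


def annotate_footage(per_frame_detections: List[List[Detection]],
                     gaze_samples: List[Tuple[float, float]]) -> List[Optional[str]]:
    """Produce one object label (or None) per frame, with no manual coding."""
    return [label_gaze(gaze, dets)
            for gaze, dets in zip(gaze_samples, per_frame_detections)]
```

Tie-breaking by confidence, and whether to expand the gaze point into a small tolerance window to absorb eye-tracker calibration error, are design choices the sketch keeps deliberately simple; the published system may handle them differently.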
Publisher
Springer Science and Business Media LLC
Subject
General Psychology, Psychology (miscellaneous), Arts and Humanities (miscellaneous), Developmental and Educational Psychology, Experimental and Cognitive Psychology
Cited by
10 articles.