Affiliation:
1. Dartmouth College
2. Stanford University
3. Georgia Institute of Technology
4. University of California Santa Babara
5. University of Arizona
Abstract
Visual focus of attention in multi-person discussions is a crucial nonverbal indicator in tasks such as inter-personal relation inference, speech transcription, and deception detection. However, predicting the focus of attention remains a challenge because the focus changes rapidly, the discussions are highly dynamic, and the people's behaviors are inter-dependent. Here we propose ICAF (Iterative Collective Attention Focus), a collective classification model to jointly learn the visual focus of attention of all people. Every person is modeled using a separate classifier. ICAF models the people collectively---the predictions of all other people's classifiers are used as inputs to each person's classifier. This explicitly incorporates inter-dependencies between all people's behaviors. We evaluate ICAF on a novel dataset of 5 videos (35 people, 109 minutes, 7604 labels in all) of the popular Resistance game and a widely-studied meeting dataset with supervised prediction. See our demo at https://cs.dartmouth.edu/dsail/demos/icaf. ICAF outperforms the strongest baseline by 1%--5% accuracy in predicting the people's visual focus of attention.
Further, we propose a lightly supervised technique to train models in the absence of training labels. We show that light-supervised ICAF performs similar to the supervised ICAF,
thus showing its effectiveness and generality to previously unseen videos.
Publisher
International Joint Conferences on Artificial Intelligence Organization
Cited by
12 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献