Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework

Published:2020-02-12 Issue:1 Volume:34 Page:
ISSN:1387-2532
Container-title:Autonomous Agents and Multi-Agent Systems
language:en
Short-container-title:Auton Agent Multi-Agent Syst

Author:

Li Guangliang, Dibeklioğlu Hamdi, Whiteson Shimon, Hung Hayley

Abstract

Interactive reinforcement learning provides a way for agents to learn to solve tasks from evaluative feedback provided by a human user. Previous research showed that humans give copious feedback early in training but very sparsely thereafter. In this article, we investigate whether agents can learn from trainers’ facial expressions by interpreting them as evaluative feedback. To do so, we implemented TAMER, a popular interactive reinforcement learning method, in a reinforcement-learning benchmark problem, Infinite Mario, and conducted the first large-scale study of TAMER, involving 561 participants. Using a CNN–RNN model that we designed, our analysis shows that instructing trainers to use facial expressions and introducing competition can improve the accuracy of estimating positive and negative feedback from facial expressions. In addition, our results from a simulation experiment show that learning solely from feedback predicted from facial expressions is possible, and that with strong prediction models or a regression method, facial responses would significantly improve agent performance. Furthermore, our experiment supports previous studies demonstrating the importance of bi-directional feedback and competitive elements in the training interface.

Funder

National Natural Science Foundation of China

Natural Science Foundation of Shandong Province

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence

Link

http://link.springer.com/content/pdf/10.1007/s10458-020-09447-w.pdf

References: 55 articles.

1. Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483.

2. Berridge, K. C. (2003). Pleasures of the brain. Brain and Cognition, 52(1), 106–128.

3. Blumberg, B., Downie, M., Ivanov, Y., Berlin, M., Johnson, M. P., & Tomlinson, B. (2002). Integrated learning for interactive synthetic characters. ACM Transactions on Graphics (TOG), 21(3), 417–426.

4. Broekens, J. (2007). Emotion and reinforcement: Affective facial expressions facilitate robot learning. In T. S. Huang, A. Nijholt, M. Pantic, & A. Pentland (Eds.), Artificial intelligence for human computing, pp. 113–132. Berlin: Springer.

5. Cohn, J. F., Kruez, T. S., Matthews, I., Yang, Y., Nguyen, M. H., Padilla, M. T., et al. (2009). Detecting depression from facial actions and vocal prosody. In Affective computing and intelligent interaction and workshops, 2009. ACII 2009. 3rd international conference on (pp. 1–7). IEEE.

Cited by 13 articles.

1. Impact of modern simulators on the development of teamwork skills: coordinated action and communication;Journal of Modern Foreign Psychology;2024-07-22

2. REACT: Two Datasets for Analyzing Both Human Reactions and Evaluative Feedback to Robots Over Time;Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction;2024-03-11

3. Leveraging Implicit Human Feedback to Better Learn from Explicit Human Feedback in Human-Robot Interactions;Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction;2024-03-11

4. Affective Computing for Human-Robot Interaction Research: Four Critical Lessons for the Hitchhiker;2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN);2023-08-28

5. EWareNet: Emotion-Aware Pedestrian Intent Prediction and Adaptive Spatial Profile Fusion for Social Robot Navigation;2023 IEEE International Conference on Robotics and Automation (ICRA);2023-05-29