JustSpeak: Automated, User-Configurable, Interactive Agents for Speech Tutoring

Published:2021-05-27 Issue:EICS Volume:5 Page:1-24
ISSN:2573-0142
Container-title:Proceedings of the ACM on Human-Computer Interaction
language:en
Short-container-title:Proc. ACM Hum.-Comput. Interact.

Author:

Zhang Xinlei¹, Miyaki Takashi², Rekimoto Jun³

Affiliation:

1. The University of Tokyo / Rekimoto Lab, Tokyo, Japan

2. The University of Tokyo, Bunkyo-ku, Tokyo, Japan

3. The University of Tokyo, Tokyo, Japan

Abstract

Conversational agents are widely used in many situations, especially for speech tutoring. However, their contents and functions are often pre-defined and not customizable for people without technical backgrounds, thus significantly limiting their flexibility and usability. Besides, conventional agents often cannot provide feedback in the middle of training sessions because they lack technical approaches to evaluate users' speech dynamically. We propose JustSpeak: automated and interactive speech tutoring agents with various configurable feedback mechanisms, using any speech recording with its transcription text as the template for speech training. In JustSpeak, we developed an automated procedure to generate customized tutoring agents from user-inputted templates. Moreover, we created a set of methods to dynamically synchronize speech recognizers' behavior with the agent's tutoring progress, making it possible to dynamically detect various speech mistakes, such as being stuck, mispronunciation, and rhythm deviations. Furthermore, we identified the design primitives in JustSpeak to create different novel feedback mechanisms, such as adaptive playback, follow-on training, and passive adaptation. They can be combined to create customized tutoring agents, which we demonstrate with an example for language learning. We believe JustSpeak can create more personalized speech learning opportunities by enabling tutoring agents that are customizable, always available, and easy to use.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Human-Computer Interaction, Social Sciences (miscellaneous)

Link

https://dl.acm.org/doi/pdf/10.1145/3459744

References: 41 articles.

1. [n.d.]. openFrameworks. https://openframeworks.cc/.

2. Jeesoo Bang, Sechun Kang, and Gary Geunbae Lee. 2013. An automatic feedback system for English speaking integrating pronunciation and prosody assessments. In Speech and Language Technology in Education.

Cited by 1 article.

1. Ask a Further Question or Give a List? How Should Conversational Agents Reply to Users’ Uncertain Queries; International Journal of Human–Computer Interaction; 2022-10-14