Affiliation:
1. University of California, Santa Cruz, CA, USA
Abstract
Conversational agents (CAs) are increasingly ubiquitous and are now commonly used to access medical information. However, we lack systematic data about the quality of advice such agents provide. This paper evaluates CA advice for mental health (MH) questions, a pressing issue given that we are undergoing a mental health crisis. Building on prior work, we define a new method to systematically evaluate mental health responses from CAs. We develop multi-utterance conversational probes derived from two widely used mental health diagnostic surveys, the PHQ-9 (Depression) and the GAD-7 (Anxiety). We evaluate the responses of two text-based chatbots and four voice assistants to determine whether CAs provide relevant responses and treatments. Evaluations were conducted both by clinicians and immersively by trained raters, yielding consistent results across all raters. Although advice and recommendations were generally low quality, they were better for Crisis probes and for probes concerning symptoms of Anxiety rather than Depression. Responses were slightly improved for text versus speech-based agents, and when CAs had access to extended dialogue context. Design implications include suggestions for improved responses through clarification sub-dialogues. Responses may also be improved by the incorporation of empathy although this needs to be combined with effective treatments or advice.
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Human-Computer Interaction
Cited by
7 articles.