Technical Metrics Used to Evaluate Health Care Chatbots: Scoping Review

Published: 2020-06-05 Issue: 6 Volume: 22 Page: e18301
ISSN: 1438-8871
Container-title: Journal of Medical Internet Research
Language: en
Short-container-title: J Med Internet Res

Author:

Abd-Alrazaq Alaa, Safi Zeineb, Alajlani Mohannad, Warren Jim, Househ Mowafa, Denecke Kerstin

Abstract

Background: Dialog agents (chatbots) have a long history of application in health care, where they have been used for tasks such as supporting patient self-management and providing counseling. Their use is expected to grow with increasing demands on health systems and improving artificial intelligence (AI) capability. Approaches to the evaluation of health care chatbots, however, appear to be diverse and haphazard, resulting in a potential barrier to the advancement of the field.

Objective: This study aims to identify the technical (nonclinical) metrics used by previous studies to evaluate health care chatbots.

Methods: Studies were identified by searching 7 bibliographic databases (eg, MEDLINE and PsycINFO) in addition to conducting backward and forward reference list checking of the included studies and relevant reviews. The studies were independently selected by two reviewers who then extracted data from the included studies. Extracted data were synthesized narratively by grouping the identified metrics into categories based on the aspect of chatbots that the metrics evaluated.

Results: Of the 1498 citations retrieved, 65 studies were included in this review. Chatbots were evaluated using 27 technical metrics, which were related to chatbots as a whole (eg, usability, classifier performance, speed), response generation (eg, comprehensibility, realism, repetitiveness), response understanding (eg, chatbot understanding as assessed by users, word error rate, concept error rate), and esthetics (eg, appearance of the virtual agent, background color, and content).

Conclusions: The technical metrics of health chatbot studies were diverse, with survey designs and global usability metrics dominating. The lack of standardization and paucity of objective measures make it difficult to compare the performance of health chatbots and could inhibit advancement of the field. We suggest that researchers more frequently include metrics computed from conversation logs. In addition, we recommend the development of a framework of technical metrics with recommendations for specific circumstances for their inclusion in chatbot studies.
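
As an illustration of an objective metric that can be computed from conversation logs, the sketch below calculates word error rate, one of the response-understanding metrics named in the Results. It assumes the common definition (substitutions, deletions, and insertions divided by the number of words in the reference transcript) and simple whitespace tokenization; the abstract does not state how the reviewed studies computed it, and the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: whitespace tokenization and the standard
    (substitutions + deletions + insertions) / N definition.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical log entry: what the user said vs. what the chatbot recognized.
print(word_error_rate("i have a headache", "i have headache"))
```

A score of 0 means the recognized text matches the reference exactly; values can exceed 1 when the hypothesis contains many inserted words.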

Publisher

JMIR Publications Inc.

Subject

Health Informatics

References: 90 articles.

1. ELIZA—a computer program for the study of natural language communication between man and machine

2. Mobile phone-based interventions for smoking cessation

3. Effectiveness of web-based interventions in achieving weight loss and weight loss maintenance in overweight and obese adults: a systematic review with meta-analysis

4. Using the Internet to Promote Health Behavior Change: A Systematic Review and Meta-analysis of the Impact of Theoretical Basis, Use of Behavior Change Techniques, and Mode of Delivery on Efficacy

Cited by 70 articles.

1. Conversational Agents in Healthcare: A Variability Perspective;Proceedings of the 18th International Working Conference on Variability Modelling of Software-Intensive Systems;2024-02-07

2. Roles, Users, Benefits and Limitations of Chatbots in Healthcare: Rapid Review (Preprint);2024-02-02

3. Embodied Conversational Agents for Chronic Diseases: Scoping Review;Journal of Medical Internet Research;2024-01-09

4. Evaluation framework for conversational agents with artificial intelligence in health interventions: a systematic scoping review;Journal of the American Medical Informatics Association;2023-12-09

5. Psychological insights into the research and practice of embodied conversational agents, chatbots and social assistive robots: a systematic meta-review;Behaviour & Information Technology;2023-11-27