Quality evaluation of digital voice assistants for diabetes management

Authors:

Chia Joy Qi En 1, Wong Li Lian 1, Yap Kevin Yi-Lwern 2,3

Affiliation:

1. Department of Pharmacy, Faculty of Science, National University of Singapore, Block S4A, Level 2, 18 Science Drive 4, Singapore 117543, Singapore

2. Department of Pharmacy, Singapore General Hospital, SingHealth Tower, 10 Hospital Boulevard, Lobby A, Level 9, Singapore 168582, Singapore

3. Department of Public Health, School of Psychology and Public Health, La Trobe University, Melbourne (Bundoora), Victoria 3086, Australia

Abstract

Background

Digital voice assistants (DVAs) are increasingly used to search for health information. However, the quality of information provided by DVAs is not consistent across health conditions. To our knowledge, no studies have evaluated the quality of DVAs in response to diabetes-related queries. The objective of this study was to evaluate the quality of DVAs in relation to queries on diabetes management.

Materials and methods

Seventy-four questions were posed to smartphone DVAs (Apple Siri, Google Assistant, Samsung Bixby) and non-smartphone DVAs (Amazon Alexa, Sulli the Diabetes Guru, Google Nest Mini, Microsoft Cortana), and their responses were compared to those of an Internet Google Search. Questions were categorized under diagnosis, screening, management, treatment and complications of diabetes, and the impacts of COVID-19 on diabetes. The DVAs were evaluated on their technical ability, user-friendliness, reliability, comprehensiveness and accuracy of responses. Data were analyzed using the Kruskal-Wallis and Wilcoxon rank-sum tests. The intraclass correlation coefficient was used to report inter-rater reliability.

Results

Google Assistant (n = 69/74, 93.2%), Siri and Nest Mini (n = 64/74, 86.5% each) had the highest proportions of successful and relevant responses, in contrast to Cortana (n = 23/74, 31.1%) and Sulli (n = 10/74, 13.5%), which had the lowest. Median total scores of the smartphone DVAs (Bixby 75.3%, Google Assistant 73.3%, Siri 72.0%) were comparable to that of Google Search (70.0%, p = 0.034), while median total scores of the non-smartphone DVAs (Nest Mini 56.9%, Alexa 52.9%, Cortana 52.5% and Sulli the Diabetes Guru 48.6%) were significantly lower (p < 0.001). Non-smartphone DVAs had much lower median comprehensiveness (16.7% versus 100.0%, p < 0.001) and reliability scores (30.8% versus 61.5%, p < 0.001) compared to Google Search.

Conclusions

Google Assistant, Siri and Bixby were the best-performing DVAs for answering diabetes-related queries. However, the lack of successful and relevant responses by Bixby may frustrate users, especially those with COVID-19-related queries. All DVAs scored highly for user-friendliness, but can be improved in terms of accuracy, comprehensiveness and reliability. DVA designers are encouraged to consider features related to accuracy, comprehensiveness, reliability and user-friendliness when developing their products, so as to enhance the quality of DVAs for medical purposes such as diabetes management.
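The Materials and methods section names the statistical tests only at a high level. The sketch below illustrates how such a comparison could be run in Python with scipy's kruskal and ranksums and pingouin's intraclass_corr; all scores and rater data in it are hypothetical placeholders, not values from the study, and the study's actual scoring rubric is not reproduced here.

```python
# Minimal sketch of the statistical workflow described in the Methods:
# Kruskal-Wallis across groups, Wilcoxon rank-sum against the Google Search
# benchmark, and an intraclass correlation for inter-rater reliability.
# All numbers below are hypothetical and for illustration only.
import pandas as pd
from scipy.stats import kruskal, ranksums
import pingouin as pg  # assumption: used here only to illustrate the ICC

# Hypothetical per-question total scores (%) for three DVAs and the benchmark.
scores = {
    "Google Assistant": [73, 75, 70, 68, 80, 74],
    "Siri":             [72, 70, 69, 75, 71, 73],
    "Bixby":            [76, 74, 78, 70, 75, 77],
    "Google Search":    [70, 68, 72, 71, 69, 70],
}

# Kruskal-Wallis test: do median scores differ across the groups?
h_stat, p_kw = kruskal(*scores.values())
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_kw:.3f}")

# Wilcoxon rank-sum test: each DVA versus the Google Search benchmark.
for dva in ("Google Assistant", "Siri", "Bixby"):
    z_stat, p_rs = ranksums(scores[dva], scores["Google Search"])
    print(f"{dva} vs Google Search: z = {z_stat:.2f}, p = {p_rs:.3f}")

# Inter-rater reliability: ICC on hypothetical scores of the same questions by two raters.
ratings = pd.DataFrame({
    "question": list(range(6)) * 2,
    "rater":    ["A"] * 6 + ["B"] * 6,
    "score":    [73, 75, 70, 68, 80, 74, 72, 74, 71, 69, 79, 75],
})
icc = pg.intraclass_corr(data=ratings, targets="question", raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])
```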

Publisher

American Institute of Mathematical Sciences (AIMS)

Subject

General Medicine

References: 78 articles
