Visualizing the Interpretation of a Criterion-Driven System that Automatically Evaluates the Quality of Health News: an Exploratory Study of Two Approaches (Preprint)

Author:

Liu Xiaoyu, Alsghaier Hiba, Ataullah Amna, McRoy Susan

Abstract

BACKGROUND

Machine learning techniques have been shown to be effective at identifying health misinformation. However, interpreting a classification model remains challenging because of its intricacy. The absence of a justification for a classification result, and of any disclosure of the model's domain knowledge, may erode end-users' trust in such models. This diminished trust may in turn undermine the effectiveness of artificial intelligence-based initiatives to counteract health misinformation.

OBJECTIVE

This study addresses the public's need for help in evaluating the quality of health news and the typical opacity of AI approaches. We aim to create an interpretable, criteria-driven system, and to find the better approach by comparing the accuracy of two methods for selecting and visualizing the system's interpretation. Both methods highlight the sentences that contributed to the system's analysis for a given criterion.

METHODS

We employ an interpretable, criteria-based approach for automatically assessing the quality of health news on the Internet. One of ten well-established criteria was chosen for our experimentation. To automate the evaluation of the criterion, we tested Logistic Regression, Naive Bayes, Support Vector Machine, and Random Forest algorithms. We then experimented with two approaches for developing interpretable representations of the results. In the first approach, (1) we calculate word feature weights, which explain how the classification models distill keywords relevant to a positive prediction; (2) using the Local Interpretable Model-agnostic Explanations (LIME) framework, we automatically select keywords for visualization that show how the classification models identify positive news articles from related keywords; and (3) the system highlights target sentences containing those keywords to justify the criterion evaluation result. In the second approach, (1) we extract sentences that provide evidence supporting the evaluation result from 100 health news articles; (2) based on these results, we train a typology classification model at the sentence level; and (3) the system highlights positive sentence instances as the result justification. We assess the accuracy of both methods using a small held-out test set.

RESULTS

The Random Forest model achieved the highest average AUC score, 0.74, in evaluating the criterion. LIME provided a visualization of how keywords affect the system's evaluation results. Both approaches successfully visualized the system's interpretation. When tested on four cases, the hybrid approach performed slightly better at highlighting correct sentences, with an accuracy of 55.00%, compared with 50.00% for the typology approach.

CONCLUSIONS

We provide an interpretable, criteria-based method for evaluating health news that combines rule-based and statistical machine learning approaches. The results suggest that criterion-based evaluation of health news quality can be automated successfully with either approach, although larger differences may emerge when multiple quality-related criteria are considered. This work can increase public trust in computerized health information evaluation.

Publisher

JMIR Publications Inc.
