Differences between human and machine perception in medical diagnosis

Authors:

Makino Taro, Jastrzębski Stanisław, Oleszkiewicz Witold, Chacko Celin, Ehrenpreis Robin, Samreen Naziya, Chhor Chloe, Kim Eric, Lee Jiyon, Pysarenko Kristine, Reig Beatriu, Toth Hildegard, Awal Divya, Du Linda, Kim Alice, Park James, Sodickson Daniel K., Heacock Laura, Moy Linda, Cho Kyunghyun, Geras Krzysztof J.

Abstract

Deep neural networks (DNNs) show promise in image-based medical diagnosis, but they cannot be fully trusted because they can fail for reasons unrelated to the underlying pathology. Humans are less likely to make such superficial mistakes, since they rely on features grounded in medical science. It is therefore important to know whether DNNs use different features than humans. To this end, we propose a framework for comparing human and machine perception in medical diagnosis. We frame the comparison in terms of perturbation robustness and mitigate Simpson's paradox by performing a subgroup analysis. The framework is demonstrated with a case study in breast cancer screening, where we separately analyze microcalcifications and soft tissue lesions. While it is inconclusive whether humans and DNNs use different features to detect microcalcifications, we find that, for soft tissue lesions, DNNs rely on high-frequency components that radiologists ignore. Moreover, these features are located outside of the image regions that radiologists find most suspicious. This difference between humans and machines was only visible through subgroup analysis, which highlights the importance of incorporating medical domain knowledge into the comparison.
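To make the perturbation-robustness framing concrete, the sketch below (not the authors' published code) low-pass filters screening images at a few severities, measures how much a model's malignancy score shifts, and aggregates the shifts per lesion subgroup so that microcalcifications and soft tissue lesions are not conflated. The function model_predict, the filter severities, and the toy data are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def model_predict(image: np.ndarray) -> float:
    """Placeholder for a trained DNN returning a malignancy probability."""
    return float(image.mean())  # stand-in only; a real model would go here

def low_pass(image: np.ndarray, sigma: float) -> np.ndarray:
    """Remove high-frequency components with a Gaussian low-pass filter."""
    return gaussian_filter(image, sigma=sigma)

def robustness_by_subgroup(images, subgroups, sigmas=(1.0, 2.0, 4.0)):
    """Mean absolute change in prediction under filtering, per lesion subgroup."""
    deltas = {}
    for img, group in zip(images, subgroups):
        base = model_predict(img)
        shift = np.mean([abs(model_predict(low_pass(img, s)) - base) for s in sigmas])
        deltas.setdefault(group, []).append(shift)
    return {g: float(np.mean(v)) for g, v in deltas.items()}

# Toy usage with random arrays standing in for mammograms.
rng = np.random.default_rng(0)
imgs = [rng.random((64, 64)) for _ in range(4)]
groups = ["microcalcification", "soft_tissue", "microcalcification", "soft_tissue"]
print(robustness_by_subgroup(imgs, groups))

A large prediction shift under low-pass filtering for one subgroup but not the other would indicate reliance on high-frequency components for that subgroup, which is the kind of difference the subgroup analysis is designed to surface.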

Funder

National Science Foundation

National Institutes of Health

Gordon and Betty Moore Foundation

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary


Cited by 11 articles.

1. A Machine Walks into an Exhibit: A Technical Analysis of Art Curation. Arts (2024-08-31)

2. Interpretable rotator cuff tear diagnosis using MRI slides with CAMscore and SHAP. Medical Imaging 2024: Computer-Aided Diagnosis (2024-04-03)

3. Product liability for defective AI. European Journal of Law and Economics (2024-02-27)

4. Artificial Intelligence for Drug Discovery: Are We There Yet? Annual Review of Pharmacology and Toxicology (2024-01-23)

5. Neural network structure simplification by assessing evolution in node weight magnitude. Machine Learning (2023-12-20)
