Differences and similarities in false interpretations by AI CAD and radiologists in screening mammography

Author:

Salim Mattie (1, 2), Dembrower Karin (1, 3), Eklund Martin (4), Smith Kevin (5), Strand Fredrik (1, 2)

Affiliation:

1. Department of Oncology and Pathology, Karolinska Institute, Karolinska University Hospital, Stockholm, Sweden

2. Department of Radiology, Breast Radiology, Karolinska University Hospital, Stockholm, Sweden

3. Department of Radiology, Breast Radiology, Capio Sankt Görans Hospital, Stockholm, Sweden

4. Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden

5. Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, Sweden

Abstract

Objective: We aimed to compare false interpretations by artificial intelligence (AI) and by radiologists in screening mammography, to better understand how the distribution of diagnostic mistakes might change when moving from entirely radiologist-driven to AI-integrated breast cancer screening.
Methods and materials: This retrospective case–control study was based on a mammography screening cohort from 2008 to 2015. The final study population included screening examinations for 714 women diagnosed with breast cancer and 8029 randomly selected healthy controls. Oversampling of controls was applied to attain a cancer proportion similar to that in the source screening cohort. We examined how false-positive (FP) and false-negative (FN) assessments by AI, the first reader (RAD 1) and the second reader (RAD 2) were associated with age, density, tumor histology and cancer invasiveness in a single- and double-reader setting.
Results: For each reader, the FN assessments were distributed between low- and high-density females as follows: 53 (42%) and 72 (58%) for AI; 59 (36%) and 104 (64%) for RAD 1; and 47 (36%) and 84 (64%) for RAD 2. The corresponding numbers for FP assessments were 1820 (47%) and 2016 (53%) for AI; 1568 (46%) and 1834 (54%) for RAD 1; and 1190 (43%) and 1610 (58%) for RAD 2. For ductal cancer, the FN assessments were 79 (77%) for AI CAD; 120 (83%) for RAD 1; and 96 (16%) for RAD 2. For the double-reading simulation, the FP assessments were distributed between younger and older females as follows: 2828 (2.5%) and 1554 (1.4%) for RAD 1 + RAD 2; 3850 (3.4%) and 2940 (2.6%) for AI + RAD 1; and 3430 (3%) and 2772 (2.5%) for AI + RAD 2.
Conclusion: The most pronounced decrease in FN assessments was noted for females over the age of 55 and for high-density women. AI could have an important complementary role when combined with radiologists to increase sensitivity for high-density and older females.
Advances in knowledge: Our results highlight the potential impact of integrating AI in breast cancer screening, particularly to improve interpretation accuracy. The use of AI could enhance screening outcomes for high-density and older females.

Publisher

Oxford University Press (OUP)

Subject

Radiology, Nuclear Medicine and Imaging; General Medicine
