Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs

Author:

Hwang Elizabeth E. (1, 2), Chen Dake (1), Han Ying (1), Jia Lin (3), Shan Jing (1)

Affiliation:

1. Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA

2. Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94143, USA

3. Digillect LLC, San Francisco, CA 94158, USA

Abstract

Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus photography, a widely available and low-cost approach already adopted for automated screening of ophthalmic diseases such as diabetic retinopathy. Despite this, the lack of validated early screening approaches remains a major obstacle in the prevention of glaucoma-related blindness. Deep learning models have gained significant interest as potential solutions, as these models offer objective and high-throughput methods for processing image-based medical data. While convolutional neural networks (CNNs) have been widely utilized for these purposes, more recent advances in the application of Transformer architectures have led to new models, including the Vision Transformer (ViT), that have shown promise in many domains of image analysis. However, previous studies have rarely compared these two architectures side by side on more than a single dataset, making it unclear which model is more generalizable or performs better in different clinical contexts. Our purpose is to investigate comparable ViT and CNN models tasked with GON detection from fundus photos and highlight their respective strengths and weaknesses. We train CNN and ViT models on six unrelated, publicly available databases and compare their performance using well-established statistics including AUC, sensitivity, and specificity. Our results indicate that ViT models often show superior performance when compared with a similarly trained CNN model, particularly when non-glaucomatous images are over-represented in a given dataset. We discuss the clinical implications of these findings and suggest that ViT can further the development of accurate and scalable GON detection for this leading cause of irreversible blindness worldwide.
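The abstract names AUC, sensitivity, and specificity as the comparison statistics. The following is a minimal sketch of how these three quantities are conventionally computed for a binary glaucoma classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all illustrative assumptions, with non-glaucomatous images deliberately over-represented.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given decision threshold.

    Label 1 = glaucomatous, label 0 = non-glaucomatous (assumed encoding).
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via its pairwise-ranking definition.

    AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    counting ties as one half.
    """
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 glaucomatous and 7 non-glaucomatous images.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

sens, spec = sensitivity_specificity(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(labels, scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three are usually reported together when class balance differs between datasets.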

Funder

UCSF Initiative for Digital Transformation in Computational Biology & Health

All May See Foundation

Think Forward Foundation

UCSF Irene Perstein Award

National Institutes of Health under NCI Award

NIGMS

Publisher

MDPI AG

Subject

Bioengineering


Cited by 4 articles.
