Abstract
Objective
Glaucoma is a leading cause of irreversible blindness worldwide. Timely detection is paramount yet challenging, particularly in resource-limited settings. A computer vision-based model for glaucoma screening using fundus images could enhance early and accurate disease detection. We developed and validated a generalizable deep-learning algorithm for glaucoma screening from fundus images.
Methods
Fundus images were collected from 20 publicly accessible databases worldwide, yielding 18,468 images from multiple clinical settings, of which 10,900 were labeled healthy and 7,568 glaucomatous. All images were evaluated and downsized to fit the model's input requirements. A candidate model was selected from 20 pre-trained models and trained on the whole dataset except Drishti-GS, which was held out for external validation. The best-performing model was then trained further to classify healthy and glaucomatous fundus images using the Fastai and PyTorch libraries. The model's predictions were compared against the ground-truth class labels using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, accuracy, precision, and F1-score.
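As an illustration of the threshold-based metrics listed above, the following is a minimal sketch, not the study's evaluation code. It assumes binary labels with 1 for glaucoma and 0 for healthy; the function name `binary_metrics` and the helper `_ratio` are hypothetical.

```python
from typing import Dict, Sequence


def _ratio(num: int, den: int) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics, treating label 1 (glaucoma) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # glaucoma correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy flagged as glaucoma
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # glaucoma missed

    sensitivity = _ratio(tp, tp + fn)  # also called recall
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, len(pairs))
    # F1 is the harmonic mean of precision and sensitivity, written in count form.
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }
```

For example, 3 true positives, 1 false negative, 1 false positive, and 3 true negatives give 0.75 for all five metrics. Swapping the positive class (healthy as 1) exchanges sensitivity and specificity, which is why per-class reports of these metrics mirror each other in a two-class problem.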
Results
The discriminative ability of the best-performing model was evaluated on a dataset comprising 1,364 glaucomatous discs and 2,047 healthy discs. The model achieved an AUROC of 0.9920 (95% CI: 0.9920 to 0.9921) for both the glaucoma and healthy classes. Sensitivity, specificity, accuracy, precision, and F1-score were all higher than 0.9530 for both classes. On external validation with the Drishti-GS dataset, the model achieved an AUROC of 0.8751 and an accuracy of 0.8713.
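The abstract does not state how the AUROC confidence interval was obtained. One common choice is a percentile bootstrap, sketched below purely as an assumption about how such an interval could be computed. The function names, the resample count, and the seed are hypothetical; AUROC is computed here as the fraction of glaucoma/healthy pairs in which the glaucoma image receives the higher score, with ties counted as half.

```python
import random
from typing import List, Sequence, Tuple


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(
    y_true: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUROC (assumed method, not the study's)."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(labels, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

The pairwise AUROC above is quadratic in the number of images and is meant for clarity; a rank-based implementation would be preferable for a dataset with thousands of images.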
Conclusions
This study demonstrated the high efficacy of our classification model in distinguishing glaucomatous from healthy discs. However, the model's accuracy dropped to 0.8713 when evaluated on unseen data, indicating potential inconsistencies among the datasets. The model therefore needs to be refined and validated on larger, more diverse datasets to ensure reliability and generalizability. Despite this, our model can be used for glaucoma screening at the population level.