Abstract
Purpose
To determine whether AI significantly affects the performance of diabetic retinopathy (DR) grading by ophthalmology residents. Secondary objectives included evaluating AI's effects on intergrader variability, self-reported confidence, and decision making.

Methods
Four ophthalmology residents at a single academic medical center, spanning all years of training (PGY-2 to PGY-4), graded 265 retinal fundus photographs for diabetic retinopathy from a publicly available dataset without and with the assistance of an AI algorithm, with the two sessions separated by a 3-week washout period.

Results
Overall, there was no significant difference without versus with AI in five-class grading, as measured by quadratic-weighted kappa (QWK), with differences ranging from +0.010 to 0.017 (p = 0.09–0.32). No significant difference without versus with AI was observed for binary classification of referable DR, except for the specificity of the PGY-3 resident (71.8% to 80%, p = 0.019). Intergrader agreement among residents increased significantly with AI (Fleiss' kappa +0.072, p = 0.0003). Self-reported confidence also increased significantly for 3 of 4 residents.

Conclusion
The use of an AI algorithm did not significantly affect the DR grading performance of ophthalmology residents but did increase intergrader agreement and self-reported confidence. Introducing AI into the ophthalmology residency curriculum may be beneficial as the technology becomes more prevalent.

Summary Statement
A cross-sectional study evaluating the performance of ophthalmology residents grading diabetic retinopathy fundus photographs with and without the assistance of an artificial intelligence algorithm.
Publisher
Cold Spring Harbor Laboratory