Abstract
Recent methods for automatic blood vessel segmentation from fundus images have commonly been implemented as convolutional neural networks. While these networks report high values for objective metrics, the clinical viability of the recovered segmentation masks remains unexplored. In this paper, we perform a pilot study to assess the clinical viability of automatically generated segmentation masks in the diagnosis of diseases affecting retinal vascularization. Five ophthalmologists with clinical experience were asked to participate in the study. The results demonstrate low classification accuracy, implying that generated segmentation masks cannot be used as a standalone resource in general clinical practice. The results also hint that the experimental design itself may be clinically infeasible. In a follow-up experiment, we evaluate the clinical quality of the masks by having ophthalmologists rank the generation methods. The ranking is established with high intra-observer consistency and indicates better subjective performance for a subset of the tested networks. The study also demonstrates that, for the methods involved, objective metrics are not correlated with subjective metrics in retinal segmentation tasks, suggesting that the objective metrics commonly used in scientific papers to measure a method’s performance are not reliable criteria for choosing clinically robust solutions.
Funder
Institute for Artificial Intelligence Research and Development of Serbia
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry