Are deep learning classification results obtained on CT scans fair and interpretable?

Published: 2024-04-04
ISSN: 2662-4729
Container-title: Physical and Engineering Sciences in Medicine
Language: en
Short-container-title: Phys Eng Sci Med

Author:

Ashames Mohamad M. A., Demir Ahmet, Gerek Omer N., Fidan Mehmet, Gulmezoglu M. Bilginer, Ergin Semih, Edizkan Rifat, Koc Mehmet, Barkana Atalay, Calisir Cuneyt

Abstract

Following the great success of deep learning methods in image and object classification, the biomedical image processing community has been flooded with applications of these methods to automatic diagnosis. Unfortunately, most deep learning-based classification studies in the literature focus solely on achieving extremely high accuracy scores, without considering interpretability or patient-wise separation of training and test data. For example, most lung nodule classification papers using deep learning randomly shuffle the data and split it into training, validation, and test sets, so that some images from a person's Computed Tomography (CT) scan end up in the training set while other images of the same person end up in the validation or test sets. This can lead to misleading reported accuracy rates and to the learning of irrelevant features, ultimately reducing the real-life usability of these models. When deep neural networks trained with this traditional, unfair data shuffling are challenged with images of new patients, the trained models are observed to perform poorly. In contrast, deep neural networks trained with strict patient-level separation maintain their accuracy rates even when tested on images of new patients. Heat map visualizations of the activations of the networks trained with strict patient-level separation indicate a higher degree of focus on the relevant nodules. We argue that the research question posed in the title has a positive answer only if the deep neural networks are trained on images of patients that are strictly isolated from the validation and test patient sets.
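
The difference between image-level shuffling and patient-level separation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `patient_level_split`, the `(patient_id, image)` record layout, and the toy data are all assumptions made for illustration. The idea is simply to shuffle patients rather than individual images, so that every slice of a given scan lands on the same side of the split.

```python
import random
from collections import defaultdict


def patient_level_split(samples, test_fraction=0.2, seed=0):
    """Split (patient_id, image) pairs so no patient appears on both sides.

    Hypothetical helper for illustration; not taken from the paper.
    """
    by_patient = defaultdict(list)
    for patient_id, image in samples:
        by_patient[patient_id].append(image)

    # Shuffle patients, not images, so all slices of one scan stay together.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])

    train = [(p, img) for p in patient_ids if p not in test_ids
             for img in by_patient[p]]
    test = [(p, img) for p in patient_ids if p in test_ids
            for img in by_patient[p]]
    return train, test


# Hypothetical toy data: 5 patients with 4 CT slices each.
samples = [(f"patient_{p}", f"slice_{p}_{s}")
           for p in range(5) for s in range(4)]

# Patient-level split: the two patient sets are disjoint by construction.
train, test = patient_level_split(samples, test_fraction=0.2, seed=0)
train_patients = {p for p, _ in train}
test_patients = {p for p, _ in test}

# Image-level shuffle for contrast: slices of one patient can land on both
# sides, which is the leakage the abstract warns about.
shuffled = samples[:]
random.Random(0).shuffle(shuffled)
naive_train, naive_test = shuffled[:16], shuffled[16:]
leaked_patients = ({p for p, _ in naive_train}
                   & {p for p, _ in naive_test})

print("patient-level overlap:", train_patients & test_patients)
print("image-level overlap:", sorted(leaked_patients))
```

In the patient-level split the overlap is always empty, whereas the image-level shuffle will typically report several patients shared between training and test data. A validation set can be carved out the same way by applying the function again to the training portion.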

Funder

Eskisehir Technical University

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s13246-024-01419-8.pdf
