Author:
Kuzinkovas Domantas, Clement Sandhya
Abstract
Advances in image classification using convolutional neural networks (CNNs) have greatly improved the accuracy of medical image diagnosis by radiologists. Numerous research groups have applied CNN methods to diagnose respiratory illnesses from chest x-rays, and have extended this work to demonstrate the feasibility of rapidly diagnosing COVID-19 with high accuracy. One issue in previous research has been the use of datasets containing only a few hundred COVID-19 chest x-ray images, which causes CNNs to overfit the image data. Overfitting lowers accuracy when the model classifies new images, as it would be expected to do in clinical use. In this work, we present a model trained on the COVID-QU-Ex dataset, which contains 33,920 chest x-ray images in total, with an equal share of COVID-19, Non-COVID pneumonia, and Normal images. The model is an ensemble of pre-trained CNNs (ResNet50, VGG19, VGG16) and GLCM textural features. It achieved 98.34% binary classification accuracy (COVID-19/no COVID-19) on a balanced test dataset of 6581 chest x-rays, and 94.68% accuracy in distinguishing between COVID-19, Non-COVID pneumonia, and Normal chest x-rays. We also discuss the effects of dataset size, demonstrating that a 98.82% 3-class accuracy can be achieved with the model when the training dataset contains only a few thousand images, but that the generalisability of the model suffers with such small datasets.
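As a rough illustration of the GLCM (grey-level co-occurrence matrix) textural features mentioned above, the following minimal sketch counts how often pairs of grey levels occur at a fixed pixel offset and derives three common texture statistics from the result. The abstract does not say which offsets, grey-level quantisation, or GLCM statistics the authors used, so the function names `glcm` and `glcm_features`, the horizontal one-pixel offset, and the choice of contrast, homogeneity, and energy are all assumptions made for illustration only.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Return a normalised grey-level co-occurrence matrix as a dict.

    `image` is a list of rows of integer grey levels. Keys are (i, j)
    grey-level pairs found at offset (dx, dy); values are probabilities.
    Hypothetical helper, not the authors' implementation.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # image too small for the requested offset
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p):
    """Compute contrast, homogeneity, and energy from a normalised GLCM."""
    contrast = sum(v * (i - j) ** 2 for (i, j), v in p.items())
    homogeneity = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())
    return {"contrast": contrast, "homogeneity": homogeneity, "energy": energy}


if __name__ == "__main__":
    # Tiny 4-level toy image standing in for a quantised chest x-ray patch.
    patch = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 3, 3],
        [2, 2, 3, 3],
    ]
    print(glcm_features(glcm(patch)))
```

In an ensemble of the kind the abstract describes, a feature vector like this could be concatenated with CNN outputs before a final classifier, although how the authors actually combine the two is not stated here.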
Publisher
Cold Spring Harbor Laboratory