Machine and Deep Learning for Tuberculosis Detection on Chest X-Rays: Systematic Literature Review (Preprint)

Authors:

Hansun Seng, Argha Ahmadreza, Liaw Siaw-Teng, Celler Branko G, Marks Guy B

Abstract

BACKGROUND

Tuberculosis (TB) was the leading infectious cause of mortality globally prior to COVID-19, and chest radiography has an important role in the detection, and subsequent diagnosis, of patients with this disease. Conventional expert reading has substantial within- and between-observer variability, indicating poor reliability of human readers. Substantial efforts have been made to use various artificial intelligence–based algorithms to address the limitations of human reading of chest radiographs for diagnosing TB.

OBJECTIVE

This systematic literature review (SLR) aims to assess the performance of machine learning (ML) and deep learning (DL) in the detection of TB using chest radiography (chest x-ray [CXR]).

METHODS

In conducting and reporting the SLR, we followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A total of 309 records were identified from the Scopus, PubMed, and IEEE (Institute of Electrical and Electronics Engineers) databases. We independently screened, reviewed, and assessed all available records and included 47 studies that met the inclusion criteria in this SLR. We also performed a risk of bias assessment using Quality Assessment of Diagnostic Accuracy Studies version 2 (QUADAS-2) and a meta-analysis of the 10 included studies that provided confusion matrix results.

RESULTS

Various CXR data sets were used in the included studies, the 2 most popular being the Montgomery County (n=29) and Shenzhen (n=36) data sets. DL (n=34) was more commonly used than ML (n=7) in the included studies. Most studies used a human radiologist’s report as the reference standard. Support vector machine (n=5), k-nearest neighbors (n=3), and random forest (n=2) were the most popular ML approaches. Meanwhile, convolutional neural networks were the most commonly used DL techniques, with the 4 most popular architectures being ResNet-50 (n=11), VGG-16 (n=8), VGG-19 (n=7), and AlexNet (n=6). Four performance metrics were commonly used, namely, accuracy (n=35), area under the curve (AUC; n=34), sensitivity (n=27), and specificity (n=23). In terms of the performance results, ML showed higher accuracy (mean ~93.71%) and sensitivity (mean ~92.55%), while on average DL models achieved better AUC (mean ~92.12%) and specificity (mean ~91.54%). Based on data from the 10 studies that provided confusion matrix results, we estimated the pooled sensitivity and specificity of ML and DL methods to be 0.9857 (95% CI 0.9477-1.00) and 0.9805 (95% CI 0.9255-1.00), respectively. From the risk of bias assessment, 17 studies were regarded as having unclear risks for the reference standard aspect and 6 studies were regarded as having unclear risks for the flow and timing aspect. Only 2 included studies had built applications based on the proposed solutions.
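For readers unfamiliar with how sensitivity, specificity, and accuracy follow from a confusion matrix, the sketch below may help. It is only an illustration: the `ConfusionMatrix` class, the function names, and all counts are hypothetical and are not data from the included studies. The pooling step simply sums counts across studies, which is a naive simplification and is not claimed to be the meta-analytic model used in this review (the abstract does not specify it).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary TB / non-TB classifier (hypothetical container)."""
    tp: int  # TB cases correctly flagged
    fp: int  # non-TB cases wrongly flagged
    fn: int  # TB cases missed
    tn: int  # non-TB cases correctly cleared


def sensitivity(cm: ConfusionMatrix) -> float:
    # Proportion of true TB cases that the model detects.
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm: ConfusionMatrix) -> float:
    # Proportion of non-TB cases that the model correctly clears.
    return cm.tn / (cm.tn + cm.fp)


def accuracy(cm: ConfusionMatrix) -> float:
    # Proportion of all cases classified correctly.
    return (cm.tp + cm.tn) / (cm.tp + cm.fp + cm.fn + cm.tn)


def pool_by_summing(matrices: Iterable[ConfusionMatrix]) -> ConfusionMatrix:
    # Naive pooling (an assumption for illustration): add up counts across
    # studies. A formal meta-analysis would typically weight studies and
    # model between-study heterogeneity instead.
    ms = list(matrices)
    return ConfusionMatrix(
        tp=sum(m.tp for m in ms),
        fp=sum(m.fp for m in ms),
        fn=sum(m.fn for m in ms),
        tn=sum(m.tn for m in ms),
    )


# Made-up counts for two imaginary studies.
study_a = ConfusionMatrix(tp=90, fp=5, fn=10, tn=95)
study_b = ConfusionMatrix(tp=45, fp=10, fn=5, tn=40)
pooled = pool_by_summing([study_a, study_b])

print(f"pooled sensitivity: {sensitivity(pooled):.4f}")
print(f"pooled specificity: {specificity(pooled):.4f}")
```

Summing counts gives larger studies more influence and ignores between-study variation, so it should be read as a rough intuition for what a pooled estimate represents and not as a substitute for a proper diagnostic accuracy meta-analysis.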

CONCLUSIONS

Findings from this SLR confirm the high potential of both ML and DL for TB detection using CXR. Future studies need to pay close attention to 2 aspects of risk of bias, namely, the reference standard and the flow and timing aspects.

CLINICALTRIAL

PROSPERO CRD42021277155; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=277155

Publisher

JMIR Publications Inc.
