Benign‐malignant classification of pulmonary nodules by low‐dose spiral computerized tomography and clinical data with machine learning in opportunistic screening

Author:

Zheng Yansong1, Dong Jing2, Yang Xue1, Shuai Ping3, Li Yongli4, Li Hailin5,6, Dong Shengyong1, Gong Yan1, Liu Miao7, Zeng Qiang1

Affiliation:

1. Department of Health Medicine, Second Medical Center & National Clinical Research Center for Geriatric Diseases, Chinese People's Liberation Army General Hospital, Beijing, China

2. Research of Medical Big Data Center & National Engineering Laboratory for Medical Big Data Application Technology, Chinese PLA General Hospital, Beijing, China

3. Health Management Center, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu, China

4. Department of Health Management, Henan Provincial People's Hospital of Zhengzhou University, Henan Key Laboratory of Chronic Disease Management, Zhengzhou, China

5. Beijing Advanced Innovation Center for Big Data‐Based Precision Medicine, School of Medicine and Engineering, Beihang University, Beijing, China

6. CAS Key Laboratory of Molecular Imaging, Institute of Automation, Beijing, China

7. Graduate School, Chinese PLA General Hospital, Beijing, China

Abstract

Background: Pulmonary nodules are found in many people during physical examinations. Discriminating benign from malignant nodules with data mining techniques is of great practical significance.

Methods: The subjects' demographic data, baseline examination results, and annual follow‐up low‐dose spiral computerized tomography (LDCT) results were recorded. The findings from annual physical examinations of positive nodules, including highly suspicious nodules and clinically tentative benign nodules, were analyzed. An extreme gradient boosting (XGBoost) model was constructed, and grid search with cross‐validation (GridSearchCV) was used to select the hyperparameters. Data from an external institution were used as an external validation set to evaluate the generalization performance of the model.

Results: A total of 135,503 physical examinees were enrolled. Baseline testing found that 27,636 (20.40%) participants had clinically tentative benign nodules and 611 (0.45%) participants had highly suspicious nodules. Among participants with a negative baseline, the proportion of highly suspicious nodules during follow‐up was about 0.12%–0.46%, which was lower than the baseline level except at follow‐up of >5 years. Among the 27,636 participants with clinically tentative benign nodules, the proportion of highly suspicious nodules was significantly greater than that of baseline screening (0.45%) only in the first year of LDCT re‐examination (1.40%; p < 0.001); in the other follow‐up years it did not differ from baseline screening (p > 0.05). Furthermore, 322 cases with benign nodules and 196 cases with malignant nodules, all confirmed by surgery and pathology, were compared. A model and the top 15 most important clinical variables were determined using the XGBoost algorithm. The area under the curve (AUC) of the model was 0.76 [95% CI: 0.67–0.84], and the accuracy was 0.75. At the selected threshold, the sensitivity and specificity of the model were 0.78 and 0.73, respectively. In the validation of the model using external data, the AUC was 0.87 and the accuracy was 0.80. The sensitivity and specificity were 0.83 and 0.77, respectively.

Conclusions: It is important to identify pulmonary nodules more accurately at the first LDCT examination. A model with 15 variables that are routinely measured in the clinic could help to distinguish benign from malignant nodules. It could help the radiological team issue a more accurate report, and it may guide the clinical team regarding LDCT follow‐up.
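The reported figures can be cross-checked with simple arithmetic. The sketch below is not part of the study: it recomputes the two baseline percentages from the stated counts and checks whether the internal accuracy of 0.75 is consistent with the reported sensitivity (0.78) and specificity (0.73). The helper name `accuracy_from_rates` is hypothetical, and the check assumes that the internal evaluation set had the same benign-to-malignant ratio as the 518 pathologically confirmed cases, which the abstract does not state.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy implied by sensitivity, specificity, and the malignant fraction.

    accuracy = sensitivity * P(malignant) + specificity * P(benign)
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Counts taken from the abstract.
n_enrolled = 135_503
n_tentative_benign = 27_636
n_highly_suspicious = 611
n_benign, n_malignant = 322, 196

# Baseline percentages recomputed from the counts.
baseline_benign_pct = 100 * n_tentative_benign / n_enrolled
baseline_suspicious_pct = 100 * n_highly_suspicious / n_enrolled

# ASSUMPTION: the evaluation set mirrors the class balance of all 518 cases.
prevalence = n_malignant / (n_benign + n_malignant)
implied_accuracy = accuracy_from_rates(0.78, 0.73, prevalence)

print(f"tentative benign at baseline:  {baseline_benign_pct:.2f}%")
print(f"highly suspicious at baseline: {baseline_suspicious_pct:.2f}%")
print(f"accuracy implied by 0.78/0.73: {implied_accuracy:.2f}")
```

Under that assumption the implied accuracy rounds to the reported 0.75, and the recomputed percentages match the reported 20.40% and 0.45%. The same check is not applied to the external validation set because its class balance is not given.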

Publisher

Wiley

Subject

Cancer Research; Radiology, Nuclear Medicine and Imaging; Oncology

Cited by 2 articles.