The Machine Learning Model for Distinguishing Pathological Subtypes of Non-Small Cell Lung Cancer

Author:

Zhao Hongyue,Su Yexin,Wang Mengjiao,Lyu Zhehao,Xu Peng,Jiao Yuying,Zhang Linhan,Han Wei,Tian Lin,Fu Peng

Abstract

PurposeMachine learning models were developed and validated to identify lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) using clinical factors, laboratory metrics, and 2-deoxy-2[18F]fluoro-D-glucose ([18F]F-FDG) positron emission tomography (PET)/computed tomography (CT) radiomic features.MethodsOne hundred and twenty non-small cell lung cancer (NSCLC) patients (62 LUAD and 58 LUSC) were analyzed retrospectively and randomized into a training group (n = 85) and validation group (n = 35). A total of 99 feature parameters—four clinical factors, four laboratory indicators, and 91 [18F]F-FDG PET/CT radiomic features—were used for data analysis and model construction. The Boruta algorithm was used to screen the features. The retained minimum optimal feature subset was input into ten machine learning to construct a classifier for distinguishing between LUAD and LUSC. Univariate and multivariate analyses were used to identify the independent risk factors of the NSCLC subtype and constructed the Clinical model. Finally, the area under the receiver operating characteristic curve (AUC) values, sensitivity, specificity, and accuracy (ACC) was used to validate the machine learning model with the best performance effect and Clinical model in the validation group, and the DeLong test was used to compare the model performance.ResultsBoruta algorithm selected the optimal subset consisting of 13 features, including two clinical features, two laboratory indicators, and nine PEF/CT radiomic features. The Random Forest (RF) model and Support Vector Machine (SVM) model in the training group showed the best performance. Gender (P=0.018) and smoking status (P=0.011) construct the Clinical model. In the validation group, the SVM model (AUC: 0.876, ACC: 0.800) and RF model (AUC: 0.863, ACC: 0.800) performed well, while Clinical model (AUC:0.712, ACC: 0.686) performed moderately. There was no significant difference between the RF and Clinical models, but the SVM model was significantly better than the Clinical model. ConclusionsThe proposed SVM and RF models successfully identified LUAD and LUSC. The results indicate that the proposed model is an accurate and noninvasive predictive tool that can assist clinical decision-making, especially for patients who cannot have biopsies or where a biopsy fails.

Publisher

Frontiers Media SA

Subject

Cancer Research,Oncology

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3