Multiomics-Based Feature Extraction and Selection for the Prediction of Lung Cancer Survival

Author:

Jaksik Roman (1), Szumała Kamila (2), Dinh Khanh Ngoc (3), Śmieja Jarosław (1)

Affiliation:

1. Department of Systems Biology and Engineering, Silesian University of Technology, 44-100 Gliwice, Poland

2. Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, 44-100 Gliwice, Poland

3. Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University, New York, NY 10027, USA

Abstract

Lung cancer is a global health challenge, and its treatment is hindered by delayed diagnosis and the disease's complex molecular landscape. Accurate prediction of patient survival is critical, which motivates the exploration of various -omics datasets using machine learning methods. Leveraging multi-omics data, this study seeks to improve the accuracy of survival prediction by proposing new feature extraction techniques combined with unbiased feature selection. Two lung adenocarcinoma multi-omics datasets, originating from the TCGA and CPTAC-3 projects, were employed for this purpose, emphasizing gene expression, methylation, and mutations as the most relevant data sources that provide features for the survival prediction models. Additionally, gene set aggregation was shown to be the most effective feature extraction method for mutation and copy number variation data. Using the TCGA dataset, we identified 32 molecular features that allowed the construction of a 2-year survival prediction model with an AUC of 0.839. The selected features were additionally tested on an independent CPTAC-3 dataset, achieving an AUC of 0.815 in nested cross-validation, which confirmed the robustness of the identified features.
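The abstract does not say how gene set aggregation is computed, so the following is only a minimal sketch of the general idea, assuming the simplest variant: each sample's gene-level mutation calls are collapsed into one count per gene set. The function name, the gene set names, and the toy patients are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def aggregate_by_gene_set(
    sample_mutations: Dict[str, Set[str]],
    gene_sets: Dict[str, Set[str]],
) -> Dict[str, Dict[str, int]]:
    """Collapse gene-level mutation calls into per-gene-set counts.

    Assumption (not from the paper): the aggregated feature is the number
    of mutated genes in a sample that belong to the given gene set.
    """
    features: Dict[str, Dict[str, int]] = {}
    for sample, mutated_genes in sample_mutations.items():
        features[sample] = {
            set_name: len(mutated_genes & members)
            for set_name, members in gene_sets.items()
        }
    return features


# Hypothetical toy gene sets and patients, for illustration only.
gene_sets = {
    "RTK_RAS": {"KRAS", "EGFR", "BRAF"},
    "TP53_PATHWAY": {"TP53", "MDM2", "CDKN2A"},
}
sample_mutations = {
    "patient_A": {"KRAS", "TP53", "STK11"},
    "patient_B": {"EGFR"},
    "patient_C": set(),
}

features = aggregate_by_gene_set(sample_mutations, gene_sets)
for sample, row in features.items():
    print(sample, row)
```

Aggregating in this way turns a sparse gene-by-sample mutation matrix into a much smaller set of denser features. That is one plausible reason such features could suit mutation and copy number variation data. Other aggregation rules, such as a binary "any gene mutated" flag or a fraction of the set, would fit the same interface.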

Funder

National Science Centre

Publisher

MDPI AG

