A comparative study of clustering methods on gene expression data for lung cancer prognosis

Published:2023-11-08 Issue:1 Volume:16
ISSN:1756-0500
Container-title:BMC Research Notes
language:en
Short-container-title:BMC Res Notes

Author:

Zhang Jason Z., Wang Chi

Abstract

Lung cancer subtyping based on gene expression data is important for identifying patient subgroups with differing survival prognosis, which facilitates customized treatment strategies for each patient subtype. Unsupervised clustering methods are the traditional approach for grouping patients into subtypes. However, since those methods cluster patients based only on gene expression data, the resulting clusters may not always be relevant to the survival outcome of interest. In recent years, semi-supervised and supervised methods have been proposed, which leverage the survival outcome data to identify clusters more relevant to survival prognosis. This paper compares the performance of different clustering methods for identifying clinically prognostic lung cancer subtypes based on two lung adenocarcinoma datasets. For each method, we clustered patients into two clusters and assessed the difference in patient survival time between clusters. Unsupervised methods were found to have large log-rank p-values and no significant results in most cases. Semi-supervised and supervised methods had improved performance over unsupervised methods and very significant p-values. These results indicate that unsupervised methods are not capable of identifying clusters with significant differences in survival prognosis in most cases, while supervised and semi-supervised methods can better cluster patients into clinically useful subtypes.
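
The evaluation step described in the abstract (split patients into two clusters, then compare survival between them with a log-rank test) can be illustrated with a minimal sketch. This is not the authors' code: the function name `logrank_test`, the toy follow-up times, and the cluster labels are all assumptions made for illustration, and the test is written in plain Python rather than with whatever software the paper used.

```python
import math


def logrank_test(times, events, groups):
    """Two-group log-rank test (illustrative helper, not from the paper).

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    groups : cluster label per patient (0 or 1)
    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    records = list(zip(times, events, groups))
    # Distinct times at which at least one death was observed.
    event_times = sorted({t for t, e, _ in records if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, overall and in cluster 1.
        n = sum(1 for u, _, _ in records if u >= t)
        n1 = sum(1 for u, _, g in records if u >= t and g == 1)
        # Deaths occurring exactly at time t, overall and in cluster 1.
        d = sum(1 for u, e, _ in records if u == t and e == 1)
        d1 = sum(1 for u, e, g in records if u == t and e == 1 and g == 1)

        observed_minus_expected += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance is undefined for a single patient
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (invented): ten patients, all deaths observed, and hypothetical
# cluster labels such as a two-cluster method might produce.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1] * 10
labels = [0] * 5 + [1] * 5

chi2, p = logrank_test(times, events, labels)
print(f"chi-square = {chi2:.2f}, log-rank p = {p:.4f}")
```

A small p-value here would indicate that the two clusters differ in survival, which is the criterion the abstract uses to judge whether a clustering is prognostically relevant. In practice an established survival analysis package would normally be preferred over a hand-written test.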

Funder

Biostatistics and Bioinformatics Shared Resource Facilities of the University of Kentucky Markey Cancer Center

National Institutes of Health

Publisher

Springer Science and Business Media LLC

Subject

General Biochemistry, Genetics and Molecular Biology; General Medicine

Link

https://link.springer.com/content/pdf/10.1186/s13104-023-06604-8.pdf

References: 28 articles.

1. Spiro SG, Silvestri GA. One hundred years of lung cancer. Am J Respir Crit Care Med. 2005;172(5):523–9.

2. Hayes DN, Monti S, Parmigiani G, Gilks CB, Naoki K, Bhattacharjee A, et al. Gene expression profiling reveals reproducible human lung adenocarcinoma subtypes in multiple independent patient cohorts. J Clin Oncol. 2006;24(31):5079–90.

3. Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci USA. 2001;98(24):13790–5.

4. Lu Y, Lemon W, Liu P-Y, Yi Y, Morrison C, Yang P, et al. A gene expression signature predicts survival of patients with stage I non-small cell lung cancer. PLoS Med. 2006;3(12):e467.

5. Perou CM, Sørlie T, Eisen MB, Van De Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature. 2000;406(6797):747–52.