Information‐incorporated sparse hierarchical cancer heterogeneity analysis

Published: 2024-03-30 Issue: 11 Volume: 43 Pages: 2280-2297
ISSN: 0277-6715
Container-title: Statistics in Medicine
Language: en

Author:

Han Wei¹,², Zhang Sanguo¹,², Ma Shuangge³, Ren Mingyang⁴

Affiliation:

1. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China

2. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China

3. Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut

4. School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China

Abstract

Cancer heterogeneity analysis is essential for precision medicine. Most of the existing heterogeneity analyses only consider a single type of data and ignore the possible sparsity of important features. In cancer clinical practice, it has been suggested that two types of data, pathological imaging and omics data, are commonly collected and can produce hierarchical heterogeneous structures, in which the refined sub‐subgroup structure determined by omics features can be nested in the rough subgroup structure determined by the imaging features. Moreover, sparsity pursuit has extraordinary significance and is more challenging for heterogeneity analysis, because the important features may not be the same in different subgroups, which is ignored by the existing heterogeneity analyses. Fortunately, rich information from previous literature (for example, those deposited in PubMed) can be used to assist feature selection in the present study. Advancing from the existing analyses, in this study, we propose a novel sparse hierarchical heterogeneity analysis framework, which can integrate two types of features and incorporate prior knowledge to improve feature selection. The proposed approach has satisfactory statistical properties and competitive numerical performance. A TCGA real data analysis demonstrates the practical value of our approach in analyzing data heterogeneity and sparsity.

Funder

National Science Foundation of Sri Lanka

National Institutes of Health

National Natural Science Foundation of China

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.10071
