Abstract
Progression Analysis of Disease (PAD) is a methodology that incorporates the output of Disease-Specific Genomic Analysis (DSGA) into an unsupervised classification scheme based on Topological Data Analysis (TDA). PAD uses data derived from healthy individuals to split each diseased sample into a healthy component and a disease component. The shape characteristics of the disease component are then extracted by generating a combinatorial graph with the Mapper algorithm. In this paper we introduce a new filtering function for the Mapper algorithm that naturally integrates information on genes linked to disease-free or overall survival. We propose an extension of PAD, termed Progression Analysis of Disease with Survival (PAD-S), and implement it in an R package called SurvMap, which allows users to carry out all the steps involved in PAD-S as well as in traditional PAD analyses. We tested the PAD-S methodology using SurvMap on a large combined transcriptomics breast cancer dataset, demonstrating its capacity to identify sets of samples displaying highly significant differences in disease-free survival (p = 8 × 10⁻¹⁴) and idiosyncratic biological features. PAD-S and SurvMap were also able to identify sets of samples with significantly different relapse-free survival and molecular profiles within the breast cancer intrinsic subgroups (luminal A, luminal B, Her2, and basal). Finally, to illustrate that PAD-S and SurvMap are general-purpose analysis tools that can be applied to different types of omics data, we also analysed a breast cancer methylation dataset derived from The Cancer Genome Atlas (TCGA), identifying groups of patients with significant differences in overall survival and methylation profiles.
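The first two steps described above (splitting diseased samples into healthy and disease components, then covering a filter function with overlapping intervals as Mapper does) can be illustrated with a minimal Python sketch. It is not the SurvMap implementation, which is an R package. It assumes a plain least-squares projection onto the span of the healthy samples, omitting any denoising of the healthy data, and it uses the norm of the disease component as a stand-in filter rather than the survival-based filter proposed in the paper. The function names `split_healthy_disease` and `interval_cover` are hypothetical.

```python
import numpy as np


def split_healthy_disease(healthy, diseased):
    """Split each diseased profile into a healthy and a disease component.

    healthy:  (n_genes, n_healthy) matrix of healthy-tissue profiles.
    diseased: (n_genes, n_diseased) matrix of diseased profiles.
    Assumption: a plain least-squares fit, with no denoising of `healthy`.
    """
    # Coefficients expressing each diseased sample as a linear
    # combination of the healthy samples.
    coeffs, *_ = np.linalg.lstsq(healthy, diseased, rcond=None)
    healthy_part = healthy @ coeffs          # projection onto the healthy span
    disease_part = diseased - healthy_part   # residual: deviation from healthy
    return healthy_part, disease_part


def interval_cover(filter_values, n_intervals=5, overlap=0.4):
    """Assign samples to overlapping intervals of a 1-D filter function."""
    values = np.asarray(filter_values, dtype=float)
    lo, hi = values.min(), values.max()
    # Width chosen so that n_intervals overlapping intervals span [lo, hi].
    width = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = width * (1.0 - overlap)
    eps = 1e-9 * max(1.0, abs(hi - lo))      # tolerance for rounding at the edges
    bins = []
    for i in range(n_intervals):
        start = lo + i * step
        inside = (values >= start - eps) & (values <= start + width + eps)
        bins.append(np.flatnonzero(inside))
    return bins


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(50, 5))
diseased = rng.normal(size=(50, 8))

healthy_part, disease_part = split_healthy_disease(healthy, diseased)

# Stand-in filter (an assumption, not the paper's survival-based filter):
# the Euclidean norm of each sample's disease component.
filter_values = np.linalg.norm(disease_part, axis=0)
bins = interval_cover(filter_values, n_intervals=3, overlap=0.5)
```

A full Mapper run would go on to cluster the samples inside each interval and connect clusters that share samples, producing the graph. That step is left out here to keep the sketch short.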
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.