Abstract
Alternative splicing (AS) enables the regulated generation of multiple mRNA and protein products from a single gene. Cancer cells have general, cancer type-specific, and subtype-specific alterations in the splicing process that can have predictive value and contribute to cancer diagnosis, prognosis, and treatment. Multi-omics data are currently used to identify the molecular subtypes of cancer, but alternative splicing is rarely used for this purpose. Here, we propose CLCluster, a redundancy-reduction contrastive learning method that integrates copy number variation, DNA methylation, gene expression, miRNA expression, and alternative splicing for cancer subtype clustering across 33 cancer types. Experimental results demonstrate that CLCluster outperforms currently available state-of-the-art clustering methods in identifying cancer subtypes. Moreover, ablation experiments demonstrate the advantages of alternative splicing data for cancer subtyping tasks. We performed multiple analyses of cancer subtype-related AS events, including open reading frame (ORF) annotation and analysis of RNA-binding protein (RBP)-associated alternative splicing regulation. We identified 2,930 AS events associated with patient survival; ORF analysis showed that 417 of them could cause in-frame changes and 420 could cause frameshifts. We also identified 1,752 RBP-AS regulatory pairs that could be associated with patient survival. Accurate classification of cancer subtypes using CLCluster, together with effective annotation of cancer subtype-related AS events, can facilitate the identification of therapeutically targetable AS events in patients.
Publisher
Cold Spring Harbor Laboratory