Similarity Network Fusion Based on Random Walk and Relative Entropy for Cancer Subtype Prediction of Multigenomic Data

Published:2021-08-18 Issue: Volume:2021 Page:1-11
ISSN:1875-919X
Container-title:Scientific Programming
language:en
Short-container-title:Scientific Programming

Author:

Liu Jian¹², Liu Wenfeng³, Cheng Yuhu¹², Ge Shuguang¹², Wang Xuesong¹²

Affiliation:

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China

2. Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou 221116, China

3. Department of Information Center, Weihai Ocean Vocational College, Rongcheng 264300, China

Abstract

Designing an integrated method that discovers cancer subtypes and captures the heterogeneity of cancer from multiple types of genomic data is a crucial task. In recent years, several clustering algorithms have been proposed and applied to cancer subtype prediction. Among them, similarity network fusion (SNF) integrates multiple types of genomic data to identify cancer subtypes, which improves the understanding of tumorigenesis. SNF uses a dense similarity matrix to capture the global information in the data, but connections between samples from different categories introduce noise. Constructing a more robust dense similarity matrix is therefore important for improving cancer subtype identification. In this paper, we propose similarity network fusion based on random walk and relative entropy (R2SNF) for cancer subtype prediction. First, a random walk algorithm captures the complex relationships between samples in each type of genomic data and yields the transition probability distribution of the samples in the network. If two samples belong to the same class, the transition probability between them is high; if they do not, it is low. The degree of correlation between samples is thereby captured well, which reduces the noise caused by connections between samples from different categories. Second, relative entropy measures the difference between the transition probability distributions of samples, producing a better dense similarity matrix that contains structural similarity information between samples. Third, the dense similarity matrix is iteratively fused with the KNN similarity matrix to construct the fused similarity matrix over all genomic data. Finally, spectral clustering groups the fused similarity matrix into multiple clusters, which indicate the cancer subtypes. Experiments on seven cancer omics datasets show that R2SNF performs well in identifying cancer subtypes.
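
The first two steps of the abstract (random-walk transition distributions, then a relative-entropy comparison of those distributions) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration and not the authors' implementation: the Gaussian starting kernel, the random walk with restart and its `restart` and `n_steps` parameters, the symmetrized Kullback-Leibler divergence, and the `exp(-d)` mapping from divergence to similarity. The function names are hypothetical.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Pairwise Gaussian affinity between rows of X (assumed starting kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def random_walk_distributions(W, restart=0.5, n_steps=50):
    """Random walk with restart started from every sample (assumed variant).

    Row i of the result is the probability distribution over samples
    reached by a walker that starts at sample i.
    """
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    I = np.eye(W.shape[0])
    R = I.copy()
    for _ in range(n_steps):
        R = (1.0 - restart) * R @ P + restart * I
    return R

def relative_entropy_similarity(R, eps=1e-12):
    """Similarity from the symmetrized KL divergence between rows of R."""
    Q = R + eps                             # avoid log(0)
    Q = Q / Q.sum(axis=1, keepdims=True)
    log_q = np.log(Q)
    self_term = np.sum(Q * log_q, axis=1)   # sum_k Q[i,k] * log Q[i,k]
    cross = Q @ log_q.T                     # cross[i,j] = sum_k Q[i,k] * log Q[j,k]
    kl = self_term[:, None] - cross         # kl[i,j] = KL(Q[i] || Q[j])
    sym = 0.5 * (kl + kl.T)
    return np.exp(-sym)                     # assumed divergence-to-similarity map

# Toy data: two well-separated groups of five samples each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (5, 2)), rng.normal(3.0, 0.1, (5, 2))])
S = relative_entropy_similarity(random_walk_distributions(gaussian_affinity(X)))
```

On this toy data, samples in the same group have similar walk distributions and therefore a higher similarity than samples in different groups, which is the behaviour the abstract describes. The fusion with the KNN similarity matrix and the spectral clustering step are not covered by this sketch.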

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

Computer Science Applications,Software

Link

http://downloads.hindawi.com/journals/sp/2021/2292703.pdf

Cited by 1 article.

1. Network medicine for patients' stratification: From single‐layer to multi‐omics; WIREs Mechanisms of Disease; 2023-06-15