NetSHy: network summarization via a hybrid approach leveraging topological properties

Author:

Vu Thao (1), Litkowski Elizabeth M (2, 3), Liu Weixuan (1), Pratte Katherine A (4), Lange Leslie (3), Bowler Russell P (5), Banaei-Kashani Farnoush (6), Kechris Katerina J (1)

Affiliation:

1. Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA

2. Department of Epidemiology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA

3. Division of Biomedical Informatics & Personalized Medicine, School of Medicine, Colorado University Anschutz Medical Campus, Aurora, CO 80045, USA

4. Department of Biostatistics, National Jewish Health, Denver, CO 80206, USA

5. Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO 80206, USA

6. Department of Computer Science and Engineering, College of Engineering, Design and Computing, University of Colorado Denver, Denver, CO 80204, USA

Abstract

Motivation: Biological networks can provide a system-level understanding of underlying processes. In many contexts, networks have a high degree of modularity, i.e. they consist of subsets of nodes, often known as subnetworks or modules, which are highly interconnected and may perform separate functions. To investigate the association between an identified module and a variable of interest in subsequent analyses, a module summarization that best explains the module's information and reduces its dimensionality is often needed. Conventional approaches for obtaining a network representation typically rely only on the profiles of the nodes within the network and disregard the inherent network topological information.

Results: In this article, we propose NetSHy, a hybrid approach that reduces the dimension of a network while incorporating topological properties to aid the interpretation of downstream analyses. In particular, NetSHy applies principal component analysis (PCA) to a combination of the node profiles and the well-known Laplacian matrix, derived directly from the network similarity matrix, to extract a summarization at the subject level. Simulation scenarios based on random and empirical networks of varying sizes and sparsity levels show that NetSHy outperforms the conventional PCA approach applied directly to node profiles, both in recovering the true correlation with a phenotype of interest and in maintaining a higher amount of explained variation in the data when networks are relatively sparse. The robustness of NetSHy is also demonstrated by a more consistent correlation with the observed phenotype as the sample size decreases. Lastly, a genome-wide association study is performed as an application of a downstream analysis, where NetSHy summarization scores on the biological networks identify more significant single nucleotide polymorphisms than the conventional network representation.

Availability and implementation: R code implementation of NetSHy is available at https://github.com/thaovu1/NetSHy

Supplementary information: Supplementary data are available at Bioinformatics online.
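The summarization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. It assumes that the "combination" of node profiles and the Laplacian is the matrix product of the scaled profile matrix with the unnormalized Laplacian L = D - A, and that the leading principal component serves as the subject-level score. The function names `laplacian` and `netshy_like_scores`, the simulated data, and the use of absolute correlations as the similarity matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A of a symmetric weighted network."""
    a = np.asarray(adjacency, dtype=float)
    degree = np.diag(a.sum(axis=1))
    return degree - a


def netshy_like_scores(profiles, adjacency, n_components=1):
    """Subject-level summary scores from PCA on (scaled profiles) x (Laplacian).

    profiles:  subjects x nodes matrix of node profiles.
    adjacency: nodes x nodes symmetric similarity matrix.
    Assumption: the combination step is a plain matrix product.
    """
    x = np.asarray(profiles, dtype=float)
    # Center and scale each node profile (column).
    x = x - x.mean(axis=0)
    std = x.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    x = x / std

    # Inject topology: weight the profiles by the network Laplacian.
    combined = x @ laplacian(adjacency)
    combined = combined - combined.mean(axis=0)

    # PCA via singular value decomposition of the centered matrix.
    u, s, _ = np.linalg.svd(combined, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical example: 50 subjects, a 6-node module.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(50, 6))
adjacency = np.abs(np.corrcoef(profiles, rowvar=False))
np.fill_diagonal(adjacency, 0.0)  # no self-loops

scores, explained = netshy_like_scores(profiles, adjacency)
print(scores.shape, explained)
```

The resulting one-column score vector could then be correlated with a phenotype or used as the response in an association analysis, in the same way a conventional first principal component of the node profiles would be.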

Funder

National Institutes of Health

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
