Affiliation:
1. School of Information and Data Sciences, Nagasaki University, Nagasaki, Japan
2. Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics Research, Saitama, Japan
3. Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
Abstract
With the determination of numerous viral and bacterial genome sequences, sequence‐trait relationships, such as the evolution of virulence and associations with geographic location or host, are now being studied. In these studies, phylogenetic trees are first reconstructed, and trait data are then analysed based on the trees. However, in some cases, such as fast‐evolving sequences and gene‐sharing network data, reconstructing a phylogenetic tree is challenging. Even in such cases, it is possible to quantify the similarity between sequences and construct a similarity network.
Here, we propose a novel approach, Network‐Trait association with Graph Fourier Transform (NeTaGFT), to analyse network‐trait associations. NeTaGFT is inspired by graph signal processing techniques. The graph in this study is a similarity network representing the similarities between virus samples, and the graph signal is the trait data defined on those samples. Using the graph Fourier transform, NeTaGFT aims to identify trait signals and associations between traits from a similarity network.
Using a simulated dataset, we validated that NeTaGFT can find signals associated with network structures as well as associations between traits. We then applied NeTaGFT to influenza type A and virome gene‐sharing datasets and identified several network structures and their associated traits.
Our approach is expected to provide novel insights through network‐based analysis, not only for typical sequence‐trait relationships but also for various other biological data, such as antibody evolution.
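To make the idea of treating trait data as a graph signal concrete, the following is a minimal sketch of a graph Fourier transform on a toy similarity network. It is not the NeTaGFT implementation: the abstract does not state which graph Laplacian is used, so the combinatorial Laplacian is an assumption here, and the function name `graph_fourier_transform`, the weight matrix `W`, and the binary `trait` vector are all hypothetical illustrations.

```python
import numpy as np


def graph_fourier_transform(W, signal):
    """Project a trait vector onto the Laplacian eigenvectors of a similarity network.

    W      : symmetric, non-negative similarity (adjacency) matrix, shape (n, n)
    signal : trait values, one per node, shape (n,)
    Assumption: the combinatorial Laplacian L = D - W is used.
    """
    W = np.asarray(W, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so low indices correspond to smooth, large-scale network structure.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    coeffs = eigvecs.T @ np.asarray(signal, dtype=float)
    return eigvals, eigvecs, coeffs


# Toy network: two tight clusters of three samples joined by one weak link.
W = np.array([
    [0.0, 1.0, 1.0, 0.01, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.01, 0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

# Hypothetical binary trait that follows the cluster structure.
trait = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

eigvals, eigvecs, coeffs = graph_fourier_transform(W, trait)

# Spectral energy per component; a trait aligned with the clusters
# should concentrate its energy in the lowest-frequency components.
energy = coeffs ** 2
low_freq_share = energy[:2].sum() / energy.sum()
print("eigenvalues:", np.round(eigvals, 4))
print("share of energy in two lowest components:", round(low_freq_share, 4))
```

In this sketch, a trait that tracks the two clusters places nearly all of its energy in the two lowest-frequency components, whereas a trait scattered at random across nodes would spread its energy over higher frequencies. Comparing such spectra across traits is one plausible way to relate traits to network structure and to each other.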
Funder
Japan Society for the Promotion of Science
Subject
Ecological Modeling; Ecology, Evolution, Behavior and Systematics