Abstract
Genome sequencing technologies reveal a huge amount of genomic sequences. Neural network-based methods can be prime candidates for retrieving insights from these sequences because of their applicability to large and diverse datasets. However, the highly variable lengths of nucleic acid sequences severely impair the presentation of sequences as input to the neural network. Genetic variations further complicate tasks that involve sequence comparison or alignment. Here, we propose a graph representation of nucleic acid sequences called gapped pattern graphs. These graphs can be transformed through a Graph Convolutional Network to form lower-dimensional embeddings for downstream tasks. On the basis of the gapped pattern graphs, we implemented a neural network model and demonstrated its performance in studying phage sequences. We compared our model with equivalent models based on other forms of input in performing four tasks related to nucleic acid sequences: phage and ICE discrimination, phage integration site prediction, lifestyle prediction, and host prediction. Other state-of-the-art tools were also compared, where available. Our method consistently outperformed all the other methods in various metrics on all four tasks. In addition, our model was able to identify distinct gapped pattern signatures from the sequences.
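The abstract does not spell out how a gapped pattern graph is built, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical construction in which nodes are k-mers and an edge links two k-mers that occur in the sequence separated by a short gap. It then applies one standard graph convolution step (symmetric normalization with self-loops) and mean-pools the result, which shows how sequences of different lengths can map to embeddings of the same size. The function names, the parameters `k` and `max_gap`, the composition features, and the fixed weight matrix `W` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def gapped_pattern_graph(seq, k=2, max_gap=1):
    """Hypothetical construction (not taken from the paper): nodes are k-mers,
    and a weighted edge links two k-mers separated by 0..max_gap bases."""
    weights = defaultdict(int)
    for i in range(len(seq) - k + 1):
        left = seq[i:i + k]
        for gap in range(max_gap + 1):
            j = i + k + gap
            if j + k <= len(seq):
                weights[(left, seq[j:j + k])] += 1
    nodes = sorted({name for edge in weights for name in edge})
    return nodes, dict(weights)


def gcn_layer(nodes, weights, X, W):
    """One graph convolution step: relu(D^-1/2 (A + I) D^-1/2 X W),
    where A is the edge-weight matrix treated as undirected."""
    n = len(nodes)
    idx = {name: i for i, name in enumerate(nodes)}
    A = [[0.0] * n for _ in range(n)]
    for (u, v), w in weights.items():
        i, j = idx[u], idx[v]
        A[i][j] += w
        if i != j:
            A[j][i] += w
    for i in range(n):
        A[i][i] += 1.0  # self-loops
    deg = [sum(row) for row in A]
    out = []
    for i in range(n):
        # Aggregate neighbour features with symmetric normalization.
        agg = [0.0] * len(X[0])
        for j in range(n):
            if A[i][j]:
                coef = A[i][j] / math.sqrt(deg[i] * deg[j])
                for f, value in enumerate(X[j]):
                    agg[f] += coef * value
        # Linear projection followed by ReLU.
        out.append([
            max(0.0, sum(agg[f] * W[f][c] for f in range(len(W))))
            for c in range(len(W[0]))
        ])
    return out


# Fixed, untrained projection (4 input features -> 2 dimensions), illustration only.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.3, 0.1]]


def embed(seq, k=2, max_gap=1):
    """Map a sequence of any length to a fixed-size vector by mean pooling."""
    nodes, weights = gapped_pattern_graph(seq, k, max_gap)
    if not nodes:
        return [0.0] * len(W[0])
    # Node features: base composition of each k-mer (an assumption).
    X = [[float(kmer.count(base)) for base in "ACGT"] for kmer in nodes]
    H = gcn_layer(nodes, weights, X, W)
    return [sum(row[c] for row in H) / len(H) for c in range(len(W[0]))]
```

Calling `embed` on a 12-base and a 30-base sequence returns two vectors of length 2 in both cases. In a trained model the projection weights would be learned rather than fixed, and several such layers would typically be stacked.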
Publisher
Cold Spring Harbor Laboratory