DeepNOG: fast and accurate protein orthologous group assignment

Published: 2020-12-01 Issue: 22-23 Volume: 36 Page: 5304-5312
ISSN: 1367-4803
Container-title: Bioinformatics
Language: en

Author:

Feldbauer Roman¹, Gosch Lukas¹, Lüftinger Lukas¹˒², Hyden Patrick¹, Flexer Arthur³, Rattei Thomas¹

Affiliation:

1. Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria

2. Ares Genetics GmbH, Vienna 1030, Austria

3. Institute of Computational Perception, Johannes Kepler University Linz, Linz 4040, Austria

Abstract

Motivation: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck.

Results: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users.

Availability and implementation: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog.

Supplementary information: Supplementary data are available at Bioinformatics online.

Funder

Austrian Science Fund

Nvidia Corporation (GPU grant)

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa1051/35383740/btaa1051.pdf

References: 43 articles.

1. Clustering with deep learning: taxonomy and new methods;Aljalbout;arXiv e-Prints, Abs/1801.07648,2018

2. The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces;Altenhoff;Nucleic Acids Res,2018

3. Principles that govern the folding of protein chains;Anfinsen;Science,1973

4. Reconciling modern machine learning practice and the bias-variance trade-off;Belkin;arXiv e-Prints,2018

5. Human gut microbiome: hopes, threats and promises;Cani;Gut,2018

Cited by 20 articles.

1. Microbiota profiling in esophageal diseases: Novel insights into molecular staining and clinical outcomes;Computational and Structural Biotechnology Journal;2024-12

2. Whole Genome Sequence Analysis of Brucella spp. from Human, Livestock, and Wildlife in South Africa;Journal of Microbiology;2024-07-22

3. Diel changes in the expression of a marker gene and candidate genes for intracellular amorphous CaCO₃ biomineralization in Microcystis;2024-07-07

4. Decoding the resistome, virulome and mobilome of clinical versus aquatic Acinetobacter baumannii in southern Romania;Heliyon;2024-07

5. Gut virome in inflammatory bowel disease and beyond;Gut;2023-11-10