DeepNOG: fast and accurate protein orthologous group assignment

Author:

Feldbauer Roman (1), Gosch Lukas (1), Lüftinger Lukas (1, 2), Hyden Patrick (1), Flexer Arthur (3), Rattei Thomas (1)

Affiliation:

1. Department of Microbiology and Ecosystem Science, University of Vienna, Vienna 1090, Austria

2. Ares Genetics GmbH, Vienna 1030, Austria

3. Institute of Computational Perception, Johannes Kepler University Linz, Linz 4040, Austria

Abstract

Motivation: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck.

Results: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment method based on deep convolutional networks. We compare DeepNOG against state-of-the-art alignment-based (HMMER, DIAMOND) and alignment-free methods (DeepFam) on two orthology databases (COG, eggNOG 5). DeepNOG can be scaled to large orthology databases like eggNOG, for which it outperforms DeepFam in terms of precision and recall by large margins. While alignment-based methods still provide the most accurate assignments among the investigated methods, computing time of DeepNOG is an order of magnitude lower on CPUs. Optional GPU usage further increases throughput massively. A command-line tool enables rapid adoption by users.

Availability and implementation: Source code and packages are freely available at https://github.com/univieCUBE/deepnog. Install the platform-independent Python program with $ pip install deepnog.

Supplementary information: Supplementary data are available at Bioinformatics online.

Funder

Austrian Science Fund

GPU

Nvidia Corporation

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

