Comparing orthology methods and their performance by recapitulating patterns of eukaryotic genome evolution

Author:

Deutekom Eva S., Snel Berend, van Dam Teunis J.P.

Abstract

Insights into the evolution of ancestral complexes and pathways are generally achieved through careful and time-intensive manual analysis, often using phylogenetic profiles of the constituent proteins. This manual analysis limits the possibility of including more protein-complex components, repeating the analyses for updated genome sets, or expanding the analyses to larger scales. Automated orthology inference should allow such large-scale analyses, but substantial differences are observed between the orthologous groups generated by different approaches.

We evaluate orthology methods for their ability to recapitulate a number of observations that have been made with regard to genome evolution in eukaryotes. Specifically, we investigate phylogenetic profile similarity (co-occurrence of complexes), the Last Eukaryotic Common Ancestor’s gene content, the pervasiveness of gene loss, and the overlap with manually determined orthologous groups. Moreover, we compare the inferred orthologies to each other.

We find that most orthology methods reconstruct a large Last Eukaryotic Common Ancestor with substantial gene loss, and can predict interacting proteins reasonably well when applying phylogenetic co-occurrence. At the same time, the derived orthologous groups show imperfect overlap with manually curated orthologous groups. There is no strong indication that any one orthology method performs better than another, either on individual aspects or on all of them. Counterintuitively, although the orthology methods behave similarly in the large-scale evaluation, the orthologous groups they produce differ vastly from one another.

Availability and implementation

The data and code underlying this article are available on GitHub and/or upon reasonable request to the corresponding author: https://github.com/ESDeutekom/ComparingOrthologies.

Summary

We compared multiple orthology inference methods by looking at how well they recapitulate multiple observations made in eukaryotic genome evolution.
Co-occurrence of proteins is predicted fairly well by most methods, and all methods show similar behaviour in loss numbers and dynamics.
All the methods show imperfect overlap when compared to manually curated orthologous groups and when compared to the orthologous groups of the other methods.
Differences between methods are compared by looking at how the inferred orthologies represent a high-quality set of manually curated orthologous groups.
We conclude that all methods behave similarly when describing general patterns in eukaryotic genome evolution. However, there are large differences within the orthologies themselves, arising from how a method differentiates between distant homology and recent duplications, or how it classifies orthologous groups.
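As an illustration of the phylogenetic co-occurrence idea mentioned in the abstract, the sketch below scores how similar two presence/absence profiles are. It is only a toy example: the abstract does not state which similarity measure the authors used, so the Jaccard index here is an assumption, and the group names (`OG_A`, `OG_B`, `OG_C`) and species labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty profiles share nothing measurable; score them as 0.0.
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: orthologous group -> species in which it is present.
profiles = {
    "OG_A": {"sp1", "sp2", "sp3", "sp5"},
    "OG_B": {"sp1", "sp2", "sp3", "sp5"},
    "OG_C": {"sp2", "sp4"},
}

# Groups with identical presence/absence patterns score 1.0;
# groups that rarely co-occur score close to 0.
sim_ab = jaccard(profiles["OG_A"], profiles["OG_B"])
sim_ac = jaccard(profiles["OG_A"], profiles["OG_C"])

print(f"OG_A vs OG_B: {sim_ab:.2f}")
print(f"OG_A vs OG_C: {sim_ac:.2f}")
```

The same function could, under the same assumption, be applied to sets of gene identifiers instead of species, giving a simple overlap score between an inferred orthologous group and a manually curated one.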

Publisher

Cold Spring Harbor Laboratory
