Affiliation:
1. East China Normal University
2. NYU-Shanghai
Abstract
Ortholog prediction is essential to many areas of genomic research, yet predictions are becoming increasingly inconsistent as the number of ortholog databases grows. The common strategy of computing consensus orthologs introduces further arbitrariness, which underscores the need to identify the proteins most prone to inconsistent ortholog prediction. To address this, we introduce the Signal Jaccard Index (SJI), a novel metric of protein similarity based on unsupervised clustering of genome context. Using SJI, we construct a protein network and find that proteins at the network periphery are the main contributors to prediction inconsistency. Importantly, we show that a protein's degree centrality can gauge how reliably it is assigned to a consensus set, which facilitates the refinement of ortholog predictions.
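The pipeline outlined in the abstract (a Jaccard-style similarity, a protein network, degree centrality) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define how SJI signals are derived, so the sketch assumes each protein is already described by a set of genome-context cluster labels and applies the plain Jaccard index to those sets. The names `signals`, `build_network`, the cluster labels, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Plain Jaccard index |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(signals, threshold):
    """Link two proteins when their Jaccard similarity reaches the threshold.

    `signals` maps a protein name to its set of cluster labels; both the
    mapping and the threshold rule are assumptions, not the paper's method.
    """
    adjacency = {protein: set() for protein in signals}
    for p, q in combinations(signals, 2):
        if jaccard(signals[p], signals[q]) >= threshold:
            adjacency[p].add(q)
            adjacency[q].add(p)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {protein: 0.0 for protein in adjacency}
    return {protein: len(nbrs) / (n - 1) for protein, nbrs in adjacency.items()}


# Toy, made-up input: four proteins with hypothetical cluster labels.
signals = {
    "protA": {"c1", "c2", "c3"},
    "protB": {"c1", "c2", "c3", "c4"},
    "protC": {"c2", "c3", "c4"},
    "protD": {"c9"},
}

network = build_network(signals, threshold=0.5)
centrality = degree_centrality(network)

# Proteins with low centrality sit at the periphery of the network.
for protein, score in sorted(centrality.items(), key=lambda item: item[1]):
    print(f"{protein}: {score:.2f}")
```

In this toy input, `protD` shares no labels with the other proteins and ends up isolated, which is the kind of peripheral node the abstract associates with inconsistent predictions.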
Publisher
Research Square Platform LLC