Iterative deletion of gene trees detects extreme biases in distance-based phylogenomic coalescent analyses

Author:

Gatesy John,Sloan Daniel B.ORCID,Warren Jessica M.ORCID,Simmons Mark P.,Springer Mark S.

Abstract

AbstractSummary coalescent methods offer an alternative to the concatenation (supermatrix) approach for inferring phylogenetic relationships from genome-scale datasets. Given huge datasets, broad congruence between contrasting phylogenomic paradigms is often obtained, but empirical studies commonly show some well supported conflicts between concatenation and coalescence results and also between species trees estimated from alternative coalescent methods. Partitioned support indices can help arbitrate these discrepancies by pinpointing outlier loci that are unjustifiably influential at conflicting nodes. Partitioned coalescence support (PCS) recently was developed for summary coalescent methods, such as ASTRAL and MP-EST, that use the summed fits of individual gene trees to estimate the species tree. However, PCS cannot be implemented when distance-based coalescent methods (e.g., STAR, NJst, ASTRID, STEAC) are applied. Here, this deficiency is addressed by automating computation of ‘partitioned coalescent branch length’ (PCBL), a novel index that uses iterative removal of individual gene trees to assess the impact of each gene on every clade in a distance-based coalescent tree. Reanalyses of five phylogenomic datasets show that PCBL for STAR and NJst trees helps quantify the overall stability/instability of clades and clarifies disagreements with results from optimality-based coalescent analyses. PCBL scores reveal severe ‘missing taxa’, ‘apical nesting’, ‘misrooting’, and ‘basal dragdown’ biases. Contrived examples demonstrate the gross overweighting of outlier gene trees that drives these biases. Because of interrelated biases revealed by PCBL scores, caution should be exercised when using STAR and NJst, in particular when many taxa are analyzed, missing data are non-randomly distributed, and widespread gene-tree reconstruction error is suspected. Similar biases in the optimality-based coalescent method MP-EST indicate that congruence among species trees estimated via STAR, NJst, and MP-EST should not be interpreted as independent corroboration for phylogenetic relationships. Such agreements among methods instead might be due to the common defects of all three summary coalescent methods.

Publisher

Cold Spring Harbor Laboratory

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3