Abstract
The evolutionary histories of different genomic regions typically differ from each other and from the underlying species phylogeny. This makes species tree estimation challenging. Here, we examine the performance of phylogenomic methods using a well-resolved phylogeny that nevertheless contains many difficult nodes, the species tree of living birds. We compared trees generated by maximum likelihood (ML) analysis of concatenated data, gene tree summary methods, and SVDquartets. We also conducted the first empirical test of a “new” method called METAL (Metric algorithm for Estimation of Trees based on Aggregation of Loci), which is based on evolutionary distances calculated using concatenated data. We conducted this test using a novel dataset comprising more than 4,000 ultraconserved element (UCE) loci from almost all bird families and two existing UCE and intron datasets sampled from almost all avian orders. We identified “reliable clades” very likely to be present in the true avian species tree and used them to assess method performance. ML analyses of concatenated data recovered almost all reliable clades with less data and greater robustness to missing data than other methods. METAL recovered many reliable clades, but only performed well with the largest datasets. Gene tree summary methods (weighted ASTRAL and weighted ASTRID) performed well; they required less data than METAL but more data than ML concatenation. SVDquartets exhibited the worst performance of the methods tested. In addition to the methodological insights, this study provides a novel estimate of avian phylogeny that includes almost 99% of the currently recognized avian families. Only one of the 181 reliable clades we examined was consistently resolved differently by ML concatenation versus other methods, suggesting that it may be possible to achieve consensus on the deep phylogeny of extant birds.
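The idea of scoring a method by how many reliable clades its tree recovers can be illustrated with a minimal sketch. This is not the procedure used in the study; it assumes rooted trees written as nested Python tuples, placeholder taxon names (A to E), and hypothetical helper names (`clades`, `recovery_rate`). Each reliable clade is treated as a set of taxa that must appear together as a complete subtree.

```python
def clades(tree):
    """Return (leaf set of this subtree, set of non-trivial clades inside it).

    A tree is either a taxon name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        # A single leaf is not counted as a clade of interest.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the taxa under this internal node form a clade
    return leaves, found


def recovery_rate(tree, reliable_clades):
    """Fraction of the reliable clades that appear as clades in the tree."""
    _, found = clades(tree)
    hits = sum(1 for clade in reliable_clades if frozenset(clade) in found)
    return hits / len(reliable_clades)


# Hypothetical reliable clades and two hypothetical estimated trees.
reliable = [{"A", "B"}, {"A", "B", "C"}, {"D", "E"}]
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = (("A", "B"), (("C", "D"), "E"))

print(recovery_rate(tree_1, reliable))
print(recovery_rate(tree_2, reliable))
```

In this toy case the first tree contains all three reliable clades and the second contains only one of them, so the first would be ranked higher. A real analysis would usually work with unrooted bipartitions and with trees read from Newick files rather than hand-written tuples.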
Publisher
Cold Spring Harbor Laboratory