Abstract
Incomplete lineage sorting (ILS) and introgression increase genealogical discordance across the genome, which complicates phylogenetic inference. In such cases, identifying orthologs that result in gene trees with low estimation error is crucial because phylogenomic methods rely on accurate gene histories. We sequenced whole genomes of tinamous (Aves: Tinamidae) to reconstruct their interrelationships and dissect the sources of gene tree and species-tree discordance. We compared results based on five ortholog sets: (1) coding genes (BUSCOs), (2) ultraconserved elements (UCEs) with short flanking regions, (3) UCEs with intermediate flanks, (4) UCEs with long flanks, and (5) UCEs mapped to the Z-chromosome. We hypothesized that orthologs with more phylogenetically informative sites would result in more accurate species trees because the resulting gene trees contain lower stochastic error. Consistent with our hypothesis (and a large body of theory), we found that long UCEs had the most informative sites and lowest rates of error. Surprisingly, BUSCO gene trees contained high error compared to long UCEs, despite having many informative sites. Unlike UCEs, BUSCO gene sequences showed a positive association between the proportion of informative sites and gene tree error. Thus, the underlying properties of molecular evolution differ between BUSCO and UCE datasets, and these differences should be considered when selecting loci for phylogenomic analysis. Importantly, these results indicate that stochastic error is not driving inaccurate gene tree estimation for BUSCO loci and instead suggest a more problematic impact of systematic error in this data type. Still, species trees from different datasets were mostly congruent. Only one clade, which has a history of ILS and introgression, exhibited substantial species-tree discordance across the different datasets.
We suggest that agreement between the Z-chromosome dataset and that of long UCEs lends support to this topology because the Z-chromosome is expected to have low rates of ILS and faster coalescent times due to its relatively smaller effective population size. Overall, we present the most complete phylogeny for tinamous to date, identify an unrecognized species, and provide a case study for species-level phylogenomic analysis using whole genomes.
Publisher
Cold Spring Harbor Laboratory