APPLES: Scalable Distance-based Phylogenetic Placement with or without Alignments

Published: 2018-11-23

Author:

Balaban Metin, Sarmashghi Shahab, Mirarab Siavash

Abstract

Placing a new species on an existing phylogeny has increasing relevance to several applications. Placement can be used to update phylogenies in a scalable fashion and can help identify unknown query samples using (meta-)barcoding, skimming, or metagenomic data. Maximum likelihood (ML) methods of phylogenetic placement exist, but they do not scale to reference trees with many thousands of leaves, which limits their ability to benefit from the dense taxon sampling of modern reference libraries. They also rely on assembled sequences for the reference set and aligned sequences for the query. Thus, ML methods cannot analyze datasets where the reference consists of unassembled reads, a scenario relevant to emerging applications of genome skimming for sample identification. We introduce APPLES, a distance-based method for phylogenetic placement. Compared to ML, APPLES is an order of magnitude faster and more memory efficient, and unlike ML, it is able to place on large backbone trees (tested for up to 200,000 leaves). We show that using dense references improves accuracy substantially, so that APPLES on dense trees is more accurate than ML on sparser trees, where ML can run. Finally, APPLES can accurately identify samples without assembled references or aligned queries using k-mer-based distances, a scenario that ML cannot handle. APPLES is publicly available at github.com/balabanmetin/apples.
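
As a rough illustration of what an alignment-free, k-mer-based distance can look like, the sketch below compares two sequences by the Jaccard index of their k-mer sets and converts it to a Mash-style distance. The abstract does not say which k-mer distance APPLES uses, so the Mash-style formula here is an assumption chosen for illustration, and the function names `kmers`, `jaccard`, and `mash_style_distance` are hypothetical and not taken from the APPLES code.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int) -> float:
    """Jaccard index of the k-mer sets of two unaligned sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    # Two sequences shorter than k share an empty k-mer set; treat as identical.
    return len(ka & kb) / len(union) if union else 1.0


def mash_style_distance(a: str, b: str, k: int) -> float:
    """Assumed Mash-style transform: D = -(1/k) * ln(2J / (1 + J))."""
    j = jaccard(a, b, k)
    if j == 0.0:
        return math.inf  # no shared k-mers: distance is undefined/unbounded
    return max(0.0, -math.log(2 * j / (1 + j)) / k)


if __name__ == "__main__":
    # Toy sequences that differ at one position; no alignment is needed.
    print(mash_style_distance("ACGTACGTAC", "ACGTTCGTAC", k=3))
```

A distance of this kind needs only k-mer sets, so in principle it can be computed from unassembled reads; a distance-based placement method would then take such query-to-reference distances as its input.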

Publisher

Cold Spring Harbor Laboratory

Cited by 3 articles.

1. Collective and harmonized high throughput barcoding of insular arthropod biodiversity: Toward a Genomic Observatories Network for islands; Molecular Ecology; 2022-09-25

2. Estimating repeat spectra and genome length from low-coverage genome skims with RESPECT; 2021-01-29

3. Forcing external constraints on tree inference using ASTRAL; BMC Genomics; 2020-04