Genealogical inference and more flexible sequence clustering using iterative-PopPUNK

Published: 2023-05-30 Issue: 6 Volume: 33 Page: 988-998
ISSN: 1088-9051
Container-title: Genome Research
Language: en
Short-container-title: Genome Res.

Author:

Zhao Bin, Lees John A., Wu Hongjin, Yang Chao, Falush Daniel

Abstract

Bacterial genome data are accumulating at an unprecedented speed due to the routine use of sequencing in clinical diagnoses, public health surveillance, and population genetics studies. Genealogical reconstruction is fundamental to many of these uses; however, inferring genealogy from large-scale genome data sets quickly, accurately, and flexibly is still a challenge. Here, we extend an alignment- and annotation-free method, PopPUNK, to increase its flexibility and interpretability across data sets. Our method, iterative-PopPUNK, rapidly produces multiple consistent cluster assignments across a range of sequence identities. By constructing a partially resolved genealogical tree with respect to these clusters, users can select a resolution most appropriate for their needs. We showed the accuracy of clusters at all levels of similarity and genealogical inference of iterative-PopPUNK based on simulated data and obtained phylogenetically concordant results in real data sets from seven bacterial species. Using two example sets of Escherichia/Shigella and Vibrio parahaemolyticus genomes, we show that iterative-PopPUNK can achieve cluster resolutions ranging from phylogroup down to sequence typing (ST). The iterative-PopPUNK algorithm is implemented in the “PopPUNK_iterate” program, available as part of the PopPUNK package.
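
The idea of "multiple consistent cluster assignments across a range of sequence identities" can be illustrated with a toy sketch. This is not the PopPUNK_iterate implementation and does not use its interface. It assumes a hand-made pairwise distance table (DIST), hypothetical genome names (NAMES), and plain single-linkage grouping at a series of distance thresholds, all chosen for illustration only. It shows that clusters obtained at tighter thresholds nest inside clusters obtained at looser ones, which is what allows them to be read as a partially resolved tree.

```python
from itertools import combinations

# Hypothetical genome names and pairwise distances (illustrative values only).
NAMES = ["A", "B", "C", "D", "E"]
DIST = {frozenset(p): 0.2 for p in combinations(NAMES, 2)}
DIST.update({
    frozenset("AB"): 0.01,
    frozenset("AC"): 0.05,
    frozenset("BC"): 0.05,
    frozenset("DE"): 0.02,
})


def clusters_at_threshold(names, dist, threshold):
    """Single-linkage clusters: join any pair whose distance is <= threshold."""
    parent = {n: n for n in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    # Sort clusters by their sorted members so the output order is stable.
    return sorted((frozenset(g) for g in groups.values()), key=lambda g: sorted(g))


def nested_clusterings(names, dist, thresholds):
    """Map each threshold to its clusters, from loosest to tightest."""
    return {
        t: clusters_at_threshold(names, dist, t)
        for t in sorted(thresholds, reverse=True)
    }


def is_nested(coarse, fine):
    """True if every fine cluster lies entirely inside some coarse cluster."""
    return all(any(f <= c for c in coarse) for f in fine)


levels = nested_clusterings(NAMES, DIST, [0.3, 0.06, 0.03, 0.005])
for t, cl in levels.items():
    print(t, [sorted(c) for c in cl])
```

Reading the printed levels from loosest to tightest gives one cluster, then two, then three, then five singletons. Each level refines the one before it, so a user could stop at whichever level matches the resolution they need.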

Funder

National Key Research and Development Program of China

Shanghai Municipal Science and Technology

National Natural Science Foundation of China

Youth Innovation Promotion Association, Chinese Academy of Sciences

Shanghai Rising-Star Program

Medical Research Council

UK Medical Research Council

MRC

UK Department for International Development

DFID

Publisher