Abstract
In this study, more than one hundred thousand Escherichia coli and Shigella genomes were examined and classified. This is, to our knowledge, the largest E. coli genome dataset analyzed to date. A Mash-based analysis of a cleaned set of 10,667 E. coli genomes from GenBank revealed 14 distinct phylogroups. A representative genome or medoid identified for each phylogroup was used as a proxy to classify 95,525 unassembled genomes from the Sequence Read Archive (SRA). We find that most of the sequenced E. coli genomes belong to four phylogroups (A, C, B1 and E2(O157)). Authenticity of the 14 phylogroups is supported by several different lines of evidence: phylogroup-specific core genes, a phylogenetic tree constructed with 2,613 single copy core genes, and differences in the rates of gene gain/loss/duplication. The methodology used in this work is able to reproduce known phylogroups, as well as to identify previously uncharacterized phylogroups in E. coli species.
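The medoid idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a precomputed pairwise distance matrix (for example, Mash distances) and picks, for each group, the genome with the smallest summed distance to the other members; a new genome is then assigned to the group whose medoid is nearest. The function names `find_medoid` and `classify`, the toy distance values, and the two-group setup are hypothetical and are not taken from the paper, whose actual pipeline may differ.

```python
from typing import Dict, List, Sequence


def find_medoid(dist: Sequence[Sequence[float]], members: List[int]) -> int:
    """Return the member whose summed distance to all group members is smallest."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def classify(dist_to_medoids: Dict[str, float]) -> str:
    """Assign a query genome to the group whose medoid is closest."""
    return min(dist_to_medoids, key=dist_to_medoids.get)


# Toy symmetric distance matrix for five genomes (made-up values, not real data).
dist = [
    [0.00, 0.01, 0.02, 0.10, 0.11],
    [0.01, 0.00, 0.01, 0.09, 0.10],
    [0.02, 0.01, 0.00, 0.10, 0.12],
    [0.10, 0.09, 0.10, 0.00, 0.01],
    [0.11, 0.10, 0.12, 0.01, 0.00],
]

# Hypothetical group membership, using two phylogroup names from the abstract.
groups = {"A": [0, 1, 2], "B1": [3, 4]}

# One representative (medoid) index per group.
medoids = {name: find_medoid(dist, members) for name, members in groups.items()}

# A query genome is described only by its distance to each medoid.
query_label = classify({"A": 0.03, "B1": 0.09})
```

Comparing each query only against the medoids, rather than against every reference genome, keeps the number of distance computations proportional to the number of groups.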
Funder
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences
Arkansas Research Alliance
Publisher
Springer Science and Business Media LLC
Subject
General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; Medicine (miscellaneous)
Cited by
59 articles.