Abstract
IncF plasmids are diverse mobile genetic elements found in bacteria from the Enterobacteriaceae family and often carry critical antibiotic resistance and virulence gene cargo. The plasmid Multi-Locus Sequence Typing (pMLST) tool classifies IncF plasmids by comparing the sequences of IncF alleles against a database to assign a plasmid sequence type (ST). Accurate identification of plasmid STs is epidemiologically useful because it enables an assessment of the crucial IncF plasmid lineages associated with pandemic and emerging enterobacterial sequence types, and thereby yields important information about specific bacterial lineages. Our initial observations showed discrepancies in IncF allele variants reported by pMLST in a collection of 898 Escherichia coli ST131 genomes. To evaluate the limitations of the pMLST tool, we interrogated an in-house and publicly available repository of 70324 E. coli genomes of various STs and other Enterobacterales genomes (n=1247). All short-read genomes and representatives selected for long-read sequencing were used to assess allele variants and to compare the tool's output with the real biological situation. When multiple allele variants occurred in a single bacterial genome, the Python and web versions of the tool randomly selected one allele to report, leading to limited and inaccurate ST identification. Discrepancies were detected in 5804 of 72469 genomes (8.01%). Long-read sequencing of 27 carefully selected genomes confirmed either multiple IncF allele variants present on one plasmid or two separate IncF plasmids present in a single bacterial cell.
The pMLST tool was unable to accurately distinguish allele variants and their location on replicons using either short-read or long-read genome sequences.
Importance
Plasmid sequence type is crucial for describing IncF plasmids because of their capacity to carry important antibiotic resistance and virulence gene cargo and their consequent association with disease-causing enterobacterial lineages exhibiting resistance to clinically relevant antibiotics in humans and food animals. Precise reporting of IncF allele variants in IncF plasmids is therefore necessary. Comparison of the FAB formulae generated by the plasmid Multi-Locus Sequence Typing (pMLST) tool with annotated long-read genome sequences identified inconsistencies. These included cases where multiple IncF allele variants were present on the same plasmid but missing from the FAB formula, and cases where two IncF plasmids were detected in one bacterial cell but the pMLST output provided information about only one plasmid. Such inconsistencies may cloud interpretation of IncF plasmid replicon type in specific bacterial lineages or lead to inaccurate assumptions of host strain clonality.
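The failure mode described above can be illustrated with a minimal sketch. This is not the pMLST tool's code: the input format (a list of `(locus, allele_number)` hits per genome), the function names, and the example hits are all hypothetical. The idea is to flag any locus with more than one distinct allele variant rather than reporting one of them arbitrarily. The last lines recompute the discrepancy rate from the counts given in the abstract.

```python
from collections import defaultdict


def alleles_per_locus(hits):
    """Group (locus, allele_number) hits by locus, keeping distinct alleles."""
    grouped = defaultdict(set)
    for locus, allele in hits:
        grouped[locus].add(allele)
    return {locus: sorted(alleles) for locus, alleles in grouped.items()}


def ambiguous_loci(hits):
    """Return the loci for which more than one distinct allele variant was found."""
    return {
        locus: alleles
        for locus, alleles in alleles_per_locus(hits).items()
        if len(alleles) > 1
    }


# Hypothetical hits for a single genome (illustrative values, not real data).
hits = [("FII", 1), ("FII", 2), ("FIA", 1), ("FIB", 20)]
flagged = ambiguous_loci(hits)
print(flagged)

# Discrepancy rate recomputed from the counts reported in the abstract.
rate = 100 * 5804 / 72469
print(f"{rate:.2f}%")
```

A genome flagged in this way cannot be assigned a single FAB formula from short reads alone, because the reads do not show whether the variants sit on one plasmid or on two.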
Publisher
Cold Spring Harbor Laboratory