Abstract
H37Rv is the most widely used M. tuberculosis strain, and its genome is used globally as the M. tuberculosis reference sequence. We developed Bact-Builder, a pipeline that leverages consensus building to generate complete and highly accurate gap-closed bacterial genomes, and applied it to three independently sequenced cultures of a parental H37Rv laboratory stock. Two of the 4,417,942 base-pair H37Rv assemblies were 100% identical, and the third differed by a single nucleotide. Compared to the existing H37Rv reference, the new sequence contained approximately 6.4 kb of additional sequence across ten new regions. These regions included insertions in PE/PPE genes and new paralogs of esxN and esxJ, which were differentially expressed compared to the reference genes. Additional sequencing and assembly with Bact-Builder confirmed that all ten regions were also present in widely accepted strains of H37Rv: NR123 and TMC102. Bact-Builder shows promise as an improved method for performing extremely accurate and reproducible de novo assemblies of bacterial genomes. Furthermore, our findings provide important updates to the primary tuberculosis reference genome.
Publisher
Cold Spring Harbor Laboratory