Abstract
Whole genome reconstruction of bacterial pathogens has become an important tool for tracking antimicrobial resistance spread; however, accurate and complete assemblies have only been achievable using hybrid long and short-read sequencing. We have previously found that the Oxford Nanopore Technologies (ONT) R10.4/kit12 flowcells produced improved assemblies over the R9.4.1/kit10, but they contained too many errors compared to hybrid Illumina-ONT assemblies. ONT have since released the R10.4.1/kit14 flowcells, which promise greater accuracy and yield. They have also released basecallers newly trained on native bacterial DNA containing methylation sites, intended to fix systematic errors, specifically adenine (A) to guanine (G) and cytosine (C) to thymine (T) substitutions. ONT have recommended the use of Bovine Serum Albumin (BSA) during library preparation to improve sequencing yield and accuracy. To evaluate these improvements, we sequenced DNA extracts from four commonly studied bacterial pathogens, namely Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa and Staphylococcus aureus, as well as 12 disparate E. coli clinical samples from different phylogroups and sequence types. These were all sequenced with and without BSA. The resulting reads were de novo assembled and compared against Illumina-corrected reference genomes. We found that nanopore long-read-only R10.4.1 (kit14) assemblies, with basecallers trained using native bacterial methylated DNA, are accurate from 40x depth or higher, sufficient to be cost-effective compared to hybrid long-read (ONT) and short-read (Illumina) sequencing.
Impact statement
Currently, the best method of building accurate and complete bacterial genome assemblies is to create a hybrid assembly, combining both long and short DNA sequencing reads. Short reads are more accurate, but can be difficult to assemble into a complete genome. Long reads are generally less accurate, but easier to reconstruct into a complete genome. By combining long and short reads, we get both accuracy and reconstructive power. However, this also involves higher costs and more labour than using a single sequencing platform. In this study, we compare long-read-only assemblies from Oxford Nanopore Technologies' newest iteration of improvements in both chemistry and software to hybrid Illumina-Nanopore assemblies. We sequenced four bacterial pathogens with published reference genomes (Staphylococcus aureus, Klebsiella pneumoniae, Pseudomonas aeruginosa, and Escherichia coli) and twelve bloodstream-associated E. coli isolates, and show that assemblies from the newest technology are not only an improvement on the previous iteration, but are able to compete with hybrid Illumina-Nanopore assemblies in quality, providing a step towards bacterial genome assembly using a single sequencing platform.
Data summary
The authors confirm all supporting data, code and protocols have been provided within the article, through supplementary data files, or in publicly accessible repositories.
Nanopore and Illumina fastq data are available in the ENA under project accession: PRJEB51164.
Assemblies have been made available at: https://figshare.com/articles/dataset/R10_4_1_KIT14_comparison_assemblies/24972954
Code and analysis outputs are available at: https://gitlab.com/ModernisingMedicalMicrobiology/assembly_comparison
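The abstract reports that long-read-only assemblies become accurate from 40x depth or higher. As a rough illustration of how that threshold might be checked for a read set before assembly, the sketch below estimates mean depth as total sequenced bases divided by genome length. This is not the authors' pipeline (their code is in the GitLab repository listed above). The function names, the four-line FASTQ layout, and the genome size passed in are all assumptions made for the example.

```python
import gzip


def total_bases(fastq_path):
    """Sum the lengths of all sequence lines in a FASTQ file (plain or gzipped).

    Assumes the common four-line-per-record FASTQ layout.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    bases = 0
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In a four-line record the sequence is the second line.
            if line_number % 4 == 1:
                bases += len(line.strip())
    return bases


def mean_depth(n_bases, genome_size):
    """Mean depth = total sequenced bases / genome length in bases."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_bases / genome_size


def meets_threshold(n_bases, genome_size, min_depth=40.0):
    """True if the estimated mean depth reaches min_depth (40x by default)."""
    return mean_depth(n_bases, genome_size) >= min_depth
```

For a hypothetical 5 Mb genome, 200 Mb of reads corresponds to a mean depth of 40x under this estimate. The calculation ignores read quality and uneven coverage, so it is only a first-pass check.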
Publisher
Cold Spring Harbor Laboratory