Comparison of long-read sequencing technologies in interrogating bacteria and fly genomes

Authors:

Tvedte Eric S (1), Gasser Mark (1), Sparklin Benjamin C (1), Michalski Jane (1,2), Hjelmen Carl E (3), Johnston J Spencer (4), Zhao Xuechu (1), Bromley Robin (1), Tallon Luke J (1), Sadzewicz Lisa (1), Rasko David A (1,2), Dunning Hotopp Julie C (1,2,5)

Affiliation:

1. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA

2. Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA

3. Department of Biology, Texas A&M University, College Station, TX 77843, USA

4. Department of Entomology, Texas A&M University, College Station, TX 77843, USA

5. Greenebaum Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA

Abstract

The newest generation of DNA sequencing technology is highlighted by the ability to generate sequence reads hundreds of kilobases in length. Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) have pioneered competitive long-read platforms, with more recent work focused on improving sequencing throughput and per-base accuracy. We used whole-genome sequencing data produced by three PacBio protocols (Sequel II CLR, Sequel II HiFi, RS II) and two ONT protocols (Rapid Sequencing and Ligation Sequencing) to compare assemblies of the bacterium Escherichia coli and the fruit fly Drosophila ananassae. In both organisms tested, Sequel II assemblies had the highest consensus accuracy, even after accounting for differences in sequencing throughput. ONT and PacBio CLR produced the longest reads, compared with PacBio RS II and HiFi, and genome contiguity was highest when assembling these datasets. ONT Rapid Sequencing libraries had the fewest chimeric reads and quantified E. coli plasmids better than ligation-based libraries. The quality of assemblies can be enhanced by adopting hybrid approaches that use Illumina libraries for bacterial genome assembly or for polishing eukaryotic genome assemblies, and an ONT-Illumina hybrid approach would be more cost-effective for many users. Genome-wide DNA methylation could be detected using both technologies; however, ONT libraries enabled the identification of a broader range of known E. coli methyltransferase recognition motifs, in addition to undocumented D. ananassae motifs. The ideal choice of long-read technology may depend on several factors, including the question or hypothesis under examination. No single technology outperformed the others in all metrics examined.

Funder

National Institute of Allergy and Infectious Diseases

National Institutes of Health

Department of Health and Human Services

National Institutes of Health Director’s Transformative Research Award

Publisher

Oxford University Press (OUP)

Subject

Genetics (clinical), Genetics, Molecular Biology
