Direct Pacbio sequencing methods and applications for different types of DNA sequences

Author:

Wang Yusha, Ma Xiaoshu, Yang Lei, Ye Hua, Jia Ruikai

Abstract

Advances in Sanger sequencing and next-generation sequencing (NGS) methods in recent years have helped investigators profile the diversity and relative abundances of heterogeneous species in vector preparations. Recombinant adeno-associated viruses (rAAVs), genome editing, and mRNA-related research are currently the most prominently investigated platforms and are widely used in synthetic biology, gene and cell therapy, the food industry, pharmaceuticals, and related areas. However, the constructs used in this research typically contain high-GC sequences, poly structures, long DNA sequences, and inverted terminal repeat (ITR) sequences; rAAV, the most prominent platform in current research, combines all of these features. Unfortunately, Sanger sequencing and NGS platforms may be inaccessible to investigators with limited resources, may require large amounts of input material, or may involve long wait times for sequencing and analysis. Sanger sequencing also has specific shortcomings for rAAV detection: sequencing primers must be designed from known sequence in order to confirm that a construct is correct, so when sequence information is incomplete, primers can only be designed at random, a read is obtained by luck, and the next round of sequencing is then planned from that read. Recent advances in PacBio sequencing have helped to meet the need for quick and relatively inexpensive long-read sequencing. Specifically, long-read methods such as single molecule real-time (SMRT) sequencing have been used to uncover truncations, chimeric genomes, and ITR mutations in vectors. However, PacBio's limitations and sample biases are not well defined. Base-calling accuracy was sometimes low, resulting in a high rate of miscalled bases and false indels. These false indels led to read-length compression; thus, assessing heterogeneity based on read length is not advisable with current PacBio technologies. In this study, we explored the capacity of PacBio sequencing to directly interrogate vector content and obtain full-length resolution of encapsulated genomes. We found that the PacBio platform can cover the entirety of different sequence types, including poly structures, long DNA fragments, high-GC sequences, and repeat sequences, and in particular rAAV sequences from ITR to ITR, without the need for pre-fragmentation. We also optimized the sequencing process so that long, difficult plasmids can be sequenced with the smallest amount of plasmid and in the shortest time. In summary, the optimized PacBio sequencing workflow and a novel bioinformatics (BI) analysis method can correctly identify truncation hotspots in single-stranded and self-complementary vectors using SMRT sequencing, and can serve as a rapid, low-cost alternative for proofing different types of sequences.

Publisher

Cold Spring Harbor Laboratory
