Integration of hybrid and self-correction method improves the quality of long-read sequencing data

Authors:

Tang Tao (1), Liu Yiping (2), Zheng Binshuang (1), Li Rong (1), Zhang Xiaocai (3), Liu Yuansheng (2)

Affiliation:

1. School of Modern Posts, Nanjing University of Posts and Telecommunications, 9 Wenyuan Rd, Qixia District, 210023, Jiangsu, China

2. College of Computer Science and Electronic Engineering, Hunan University, 2 Lushan S Rd, Yuelu District, 410086, Changsha, China

3. Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), 138632, Singapore

Abstract

Third-generation sequencing (TGS) technologies have revolutionized genome science in the past decade. However, the long-read data produced by TGS platforms have a much higher error rate than data from earlier technologies, which complicates downstream analysis. Several error correction tools for long-read data have been developed; they fall into two categories, hybrid tools and self-correction tools. So far, these two types of tools have been investigated separately, and their interplay remains understudied. Here, we integrate hybrid and self-correction methods for high-quality error correction. Our procedure leverages both the similarity among long reads and the high-accuracy information from short reads. We compare the performance of our method with state-of-the-art error correction tools on Escherichia coli and Arabidopsis thaliana datasets. The results show that the integrated approach outperforms the existing error correction methods and holds promise for improving the quality of downstream analyses in genomic research.

Publisher

Oxford University Press (OUP)

Subject

Genetics, Molecular Biology, Biochemistry, General Medicine

