Sequencing error profiles of Illumina sequencing instruments

Author:

Stoler Nicholas (1), Nekrutenko Anton (2)

Affiliation:

1. Graduate Program in Bioinformatics and Genomics, The Huck Institutes for Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA

2. Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA

Abstract

Sequencing technology has achieved great advances in the past decade. Studies have previously shown the quality of specific instruments in controlled conditions. Here, we developed a method able to retroactively determine the error rate of most public sequencing datasets. To do this, we utilized the overlaps between reads that are a feature of many sequencing libraries. With this method, we surveyed 1943 different datasets from seven different sequencing instruments produced by Illumina. We show that among public datasets, the more expensive platforms like HiSeq and NovaSeq have a lower error rate and less variation. But we also discovered that there is great variation within each platform, with the accuracy of a sequencing experiment depending greatly on the experimenter. We show the importance of sequence context, especially the phenomenon where preceding bases bias the following bases toward the same identity. We also show the difference in patterns of sequence bias between instruments. Contrary to expectations based on the underlying chemistry, HiSeq X Ten and NovaSeq 6000 share notable exceptions to the preceding-base bias. Our results demonstrate the importance of the specific circumstances of every sequencing experiment, and the importance of evaluating the quality of each one.

Funder

NHGRI

NSF ABI Grant

NIAID

Publisher

Oxford University Press (OUP)

Subject

General Medicine

