Mapinsights: deep exploration of quality issues and error profiles in high-throughput sequence data

Author:

Das Subrata1ORCID,Biswas Nidhan K1ORCID,Basu Analabha1ORCID

Affiliation:

1. National Institute of Biomedical Genomics ,  Kalyani , 741251, West Bengal,  India

Abstract

Abstract High-throughput sequencing (HTS) has revolutionized science by enabling super-fast detection of genomic variants at base-pair resolution. Consequently, it poses the challenging problem of identification of technical artifacts, i.e. hidden non-random error patterns. Understanding the properties of sequencing artifacts holds the key in separating true variants from false positives. Here, we develop Mapinsights, a toolkit that performs quality control (QC) analysis of sequence alignment files, capable of detecting outliers based on sequencing artifacts of HTS data at a deeper resolution compared with existing methods. Mapinsights performs a cluster analysis based on novel and existing QC features derived from the sequence alignment for outlier detection. We applied Mapinsights on community standard open-source datasets and identified various quality issues including technical errors related to sequencing cycles, sequencing chemistry, sequencing libraries and across various orthogonal sequencing platforms. Mapinsights also enables identification of anomalies related to sequencing depth. A logistic regression-based model built on the features of Mapinsights shows high accuracy in detecting ‘low-confidence’ variant sites. Quantitative estimates and probabilistic arguments provided by Mapinsights can be utilized in identifying errors, bias and outlier samples, and also aid in improving the authenticity of variant calls.

Funder

Ministry of Electronics and Information Technology

Department of Biotechnology

Publisher

Oxford University Press (OUP)

Subject

Genetics

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3