Abstract
The accurate detection of variants is essential for genomics-based studies. Various tools are currently available for detecting genomic variants; however, deciding which tool to use has always been a challenge, especially because major genome projects have chosen different tools. Thus far, most existing tools were developed primarily for short-read data (i.e., Illumina); however, other sequencing technologies (e.g., PacBio and Oxford Nanopore) have recently been shown to be usable for variant calling as well. In addition, with the emergence of artificial intelligence (AI)-based variant calling tools, there is a pressing need to compare these tools in terms of efficiency, accuracy, computational cost, and ease of use. In this study, we evaluated the most widely used conventional and AI-based variant calling tools (BCFTools, GATK4, Platypus, DNAscope, and DeepVariant) in terms of accuracy and computational cost using both short-read and long-read data derived from three different sequencing technologies for the same set of samples from the Genome in a Bottle (GIAB) project. The analysis showed that AI-based variant calling tools outperform conventional ones for calling SNVs and INDELs using both long and short reads. In addition, we demonstrate the advantages and drawbacks of each tool while ranking them in each aspect of these comparisons. This study provides best practices for variant calling using AI-based and conventional variant callers with different types of sequencing data.
Publisher
Cold Spring Harbor Laboratory