Author:
Lee Dong-Jun, Kwon Taesoo, Lee Hye-Jin, Oh Yun-Ho, Kim Jin-Hyun, Lee Tae-Ho
Abstract
Next-generation sequencing (NGS) is widely used in all areas of genetic research, such as genetic disease diagnosis and breeding, and it can produce massive amounts of data. The identification of sequence variants is an important step when processing large NGS datasets; however, the process is currently complicated, repetitive, and demands sustained concentration, which can be taxing on the researcher. Therefore, to support researchers who are not familiar with bioinformatics in routinely identifying sequence variations in large datasets, we have developed NGSpop, a fully automated desktop software package. NGSpop covers all the variant calling and visualization procedures used when processing NGS data, including quality control, mapping, filtering, and variant calling. In the variant calling step, the user can select either the GATK or the DeepVariant algorithm. These algorithms can be executed using pre-set pipelines and options or customized with user-specified options. NGSpop is implemented using JavaFX (version 1.8) and can thus be run on Unix-like operating systems such as Ubuntu Linux (versions 16.04 and 18.04). Although several pipelines and visualization tools are available for NGS data analysis, most integrated environments do not support batch processing; thus, variant detection cannot be automated for population-level studies. The NGSpop software developed in this study has an easy-to-use interface and enables rapid analysis of multiple NGS datasets from population studies.
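To illustrate the batch-processing idea described in the abstract, the following is a minimal Python sketch of how a per-sample stage plan might be laid out for a population study. It is not NGSpop's implementation (NGSpop is written with JavaFX). The stage names come from the abstract, but their order, the `Job` class, and the `plan_batch` function are hypothetical names introduced here for illustration only, and no real GATK or DeepVariant commands are invoked.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stage names are taken from the abstract; their order is an assumption.
STAGES = ("quality_control", "mapping", "filtering", "variant_calling")
# The two variant callers named in the abstract.
CALLERS = ("GATK", "DeepVariant")


@dataclass(frozen=True)
class Job:
    """One unit of work: a single stage applied to a single sample (hypothetical)."""
    sample: str
    stage: str
    caller: Optional[str] = None  # only set for the variant calling stage


def plan_batch(samples: List[str], caller: str = "GATK") -> List[Job]:
    """Expand a list of samples into an ordered list of per-sample jobs."""
    if caller not in CALLERS:
        raise ValueError(f"unknown caller {caller!r}; expected one of {CALLERS}")
    jobs = []
    for sample in samples:
        for stage in STAGES:
            # Only the final stage depends on the chosen caller.
            chosen = caller if stage == "variant_calling" else None
            jobs.append(Job(sample, stage, chosen))
    return jobs


if __name__ == "__main__":
    # Hypothetical sample identifiers for a small population study.
    for job in plan_batch(["sample_01", "sample_02"], caller="DeepVariant"):
        print(job)
```

Keeping the plan as plain data, separate from execution, means the same list could be run sequentially or handed to a scheduler. That is one possible design and is not something the abstract specifies.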
Publisher
Cold Spring Harbor Laboratory