Figbird: a probabilistic method for filling gaps in genome assemblies
Author:
Tarafder Sumit 1,2,
Islam Mazharul 1,2,
Shatabda Swakkhar 2,
Rahman Atif 1
Affiliation:
1. Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh
2. Department of Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh
Abstract
Motivation
Advances in sequencing technologies have led to the sequencing of the genomes of a multitude of organisms. However, draft genomes of many of these organisms contain a large number of gaps due to repeats in genomes, low sequencing coverage and limitations in sequencing technologies. Although several tools exist for filling gaps, many of them do not utilize all information relevant to gap filling.
Results
Here, we present a probabilistic method for filling gaps in draft genome assemblies using second-generation reads. It is based on a generative model for sequencing that takes into account information on insert sizes and sequencing errors. Unlike the graph-based methods adopted in the literature, our method is based on the expectation-maximization algorithm. Experiments on real biological datasets show that this novel approach can fill large portions of gaps with a small number of errors and misassemblies compared to other state-of-the-art gap-filling tools.
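To make the idea of a generative model with insert sizes and sequencing errors concrete, the following is a minimal illustrative sketch, not Figbird's actual model or code. It assumes substitution-only errors at a uniform rate, a normal insert-size distribution, and equal-length read/candidate pairs; all function and parameter names (`mismatch_likelihood`, `insert_size_likelihood`, `responsibilities`, `error_rate`, `mean`, `sd`) are hypothetical. It shows how an E-step-style soft assignment of one read over several candidate placements could be computed.

```python
import math


def mismatch_likelihood(read: str, target: str, error_rate: float) -> float:
    """P(read | target), assuming independent substitution errors only."""
    if len(read) != len(target):
        raise ValueError("read and target must have equal length")
    p = 1.0
    for r, t in zip(read, target):
        # A correct base has probability 1 - error_rate; each of the
        # three possible wrong bases gets error_rate / 3 (an assumption).
        p *= (1.0 - error_rate) if r == t else error_rate / 3.0
    return p


def insert_size_likelihood(insert_size: float, mean: float, sd: float) -> float:
    """Normal density of the implied insert size (assumed distribution)."""
    z = (insert_size - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def responsibilities(read, candidates, insert_sizes, error_rate, mean, sd):
    """Soft-assign one read over candidate placements (E-step style).

    candidates[i] is the sequence the read would align to at placement i,
    and insert_sizes[i] is the insert size that placement would imply.
    """
    weights = [
        mismatch_likelihood(read, cand, error_rate)
        * insert_size_likelihood(ins, mean, sd)
        for cand, ins in zip(candidates, insert_sizes)
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all placements have zero likelihood")
    return [w / total for w in weights]
```

In a full expectation-maximization loop, weights like these would be used to update the estimate of the gap sequence, and the two steps would be repeated until the estimate stabilizes; that update is omitted here.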
Availability and implementation
The method is implemented in C++ as a software tool named ‘Filling Gaps by Iterative Read Distribution (Figbird)’, which is available at https://github.com/SumitTarafder/Figbird.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
Institute of Advanced Research (IAR) of United International University
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
Cited by
2 articles.
1. Utilizing Deep Neural Networks to Fill Gaps in Small Genomes;International Journal of Molecular Sciences;2024-08-04
2. HRGF-GapCloser: A gap filling method base on HiFi read and read clustering;Proceedings of the 2024 4th International Conference on Bioinformatics and Intelligent Computing;2024-01-26