Abstract
Background
Despite constantly improving genome sequencing methods, error-free eukaryotic genome assembly has not yet been achieved. One of the remaining problems is so-called "haplotypic duplications", which may manifest as alleles being mistakenly assembled as paralogues. Haplotypic duplications are dangerous because they create the illusion of gene family expansions and may thus lead scientists to incorrect conclusions about genome evolution and function.
Results
Here, I present Mabs, a suite of tools that serve as parameter optimizers for the popular genome assemblers Hifiasm and Flye. By optimizing the parameters of Hifiasm and Flye, Mabs tries to create genome assemblies in which genes are assembled as accurately as possible. In tests on 6 eukaryotic genomes, Mabs created assemblies with more accurately assembled genes than those generated by Hifiasm and Flye with default parameters in all 6 cases. When the assemblies produced by Mabs, Hifiasm and Flye were postprocessed with Purge_dups, a popular tool for haplotypic duplication removal, genes were better assembled by Mabs in 5 out of 6 cases.
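The general idea of choosing an assembler parameter by a gene-accuracy score can be illustrated with a minimal sketch. It is not the Mabs implementation: the function `pick_best_parameter`, the candidate values and the scoring function `toy_score` are all hypothetical, and the scoring function stands in for "run the assembler with this value, then measure how accurately the genes are assembled".

```python
from typing import Callable, Iterable, Tuple


def pick_best_parameter(
    candidates: Iterable[float],
    score_assembly: Callable[[float], float],
) -> Tuple[float, float]:
    """Return the (value, score) pair with the highest score.

    `score_assembly` is a hypothetical callback: in a real pipeline it would
    run an assembler with the given parameter value and return a measure of
    gene assembly accuracy. Here it is just any function of one number.
    """
    best_value = None
    best_score = float("-inf")
    for value in candidates:
        score = score_assembly(value)
        if score > best_score:
            best_value, best_score = value, score
    if best_value is None:
        raise ValueError("no candidate parameter values were given")
    return best_value, best_score


def toy_score(value: float) -> float:
    # Toy stand-in for an assembly-quality score; peaks at value == 0.4.
    return -(value - 0.4) ** 2


best_value, best_score = pick_best_parameter([0.1, 0.2, 0.4, 0.8], toy_score)
print(best_value, best_score)
```

An exhaustive scan over a short candidate list is used here only for clarity; since every evaluation would correspond to a full assembly run, a practical optimizer would try to limit the number of evaluations.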
Conclusions
Mabs is useful for making high-quality genome assemblies. It is available at https://github.com/shelkmike/Mabs
Funder
Russian Science Foundation
Ministry of Science and Higher Education of the Russian Federation
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology