Author:
Su Chung-Tsai, Chang Ming-Tai, Cheng Yun-Chian, Li Yun-Lung, Wang Yao-Ting
Abstract
Summary: De novo genome assembly is important both for assembling uncharacterized genomes and for identifying variants in a reference-unbiased way. Compared with the de Bruijn graph, the string graph is a lossless data representation for de novo assembly. However, string graph construction is computationally intensive. We propose GraphSeq, which accelerates string graph construction by leveraging a distributed computing framework.
Availability and Implementation: GraphSeq is implemented in Scala on Spark and is freely available at https://www.atgenomix.com/blog/graphseq.
Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.