AllSome Sequence Bloom Trees

Published: 2016-12-02

Author:

Sun Chen, Harris Robert S., Chikhi Rayan, Medvedev Paul

Abstract

The ubiquity of next-generation sequencing has transformed the size and nature of many databases, pushing the boundaries of current indexing and searching methods. One particular example is a database of 2,652 human RNA-seq experiments uploaded to the Sequence Read Archive (SRA). Recently, Solomon and Kingsford proposed the Sequence Bloom Tree data structure and demonstrated how it can be used to accurately identify SRA samples in which a transcript of interest is potentially expressed. In this paper, we propose an improvement called the AllSome Sequence Bloom Tree. Results show that our new data structure significantly improves performance, reducing tree construction time by 52.7% and query time by 39-85%, at the cost of up to 3x memory consumption during queries. Notably, it can process a batch of 198,074 queries in under 8 hours (compared to around two days previously) and a whole set of k-mers from a sequencing experiment (about 27 million k-mers) in under 11 minutes.

Publisher

Cold Spring Harbor Laboratory

References: 39 articles.

1. SBT-SK software and data. http://www.cs.cmu.edu/%7Eckingsf/software/bloomtree/, Accessed: 2016-07-01

2. Baier, U., Beller, T., Ohlebusch, E.: Graphical pan-genome analysis with compressed suffix trees and the Burrows–Wheeler transform. Bioinformatics, p. btv603 (2015)

3. Space/time trade-offs in hash coding with allowable errors

4. Near-optimal probabilistic RNA-seq quantification

5. Better bitmap performance with roaring bitmaps; Software: Practice and Experience, 2015

Cited by 4 articles.

1. Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries;Genome Biology;2023-05-31

2. Hierarchical Interleaved Bloom Filter: Enabling ultrafast, approximate sequence queries;2022-08-01

3. Co-Design for Energy Efficient and Fast Genomic Search;Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays;2022-02-11

4. Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences;2020-10-08