Burst tries

Published:2002-04 Issue:2 Volume:20 Page:192-223
ISSN:1046-8188
Container-title:ACM Transactions on Information Systems
Language:en
Short-container-title:ACM Trans. Inf. Syst.

Author:

Heinz Steffen¹, Zobel Justin¹, Williams Hugh E.¹

Affiliation:

1. RMIT University, Melbourne, Victoria, Australia

Abstract

Many applications depend on efficient management of large sets of distinct strings in memory. For example, during index construction for text databases a record is held for each distinct word in the text, containing the word itself and information such as counters. We propose a new data structure, the burst trie, that has significant advantages over existing options for such applications: it uses about the same memory as a binary search tree; it is as fast as a trie; and, while not as fast as a hash table, a burst trie maintains the strings in sorted or near-sorted order. In this paper we describe burst tries and explore the parameters that govern their performance. We experimentally determine good choices of parameters, and compare burst tries to other structures used for the same task, with a variety of data sets. These experiments show that the burst trie is particularly effective for the skewed frequency distributions common in text collections, and dramatically outperforms all other data structures for the task of managing strings while maintaining sort order.
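The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general burst trie idea under stated assumptions: strings are routed through trie nodes into small leaf containers, and a container that grows past a limit is "burst" into a new trie node with one smaller container per next character. The names `BurstTrie`, `Container`, `TrieNode`, and `BURST_THRESHOLD`, the use of a sorted Python list as the container, and the simple size-limit burst rule are all illustrative choices, not the paper's implementation or its tuned parameters.

```python
import bisect

BURST_THRESHOLD = 4  # hypothetical container capacity, not a value from the paper


class Container:
    """Leaf bucket: remaining suffixes kept sorted, each with a counter."""
    def __init__(self):
        self.keys = []    # sorted list of suffixes
        self.counts = {}  # suffix -> occurrence count


class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode or Container
        self.end_count = 0  # count of the string that ends exactly here


class BurstTrie:
    def __init__(self, threshold=BURST_THRESHOLD):
        self.root = TrieNode()
        self.threshold = threshold

    def insert(self, word):
        node, i = self.root, 0
        while i < len(word):
            ch = word[i]
            child = node.children.get(ch)
            if child is None:
                child = node.children[ch] = Container()
            if isinstance(child, TrieNode):
                node, i = child, i + 1
                continue
            # Reached a container: store the rest of the string there.
            suffix = word[i + 1:]
            if suffix in child.counts:
                child.counts[suffix] += 1
            else:
                bisect.insort(child.keys, suffix)
                child.counts[suffix] = 1
                if len(child.keys) > self.threshold:
                    node.children[ch] = self._burst(child)
            return
        node.end_count += 1  # the whole word was consumed by trie nodes

    def _burst(self, container):
        """Replace a full container with a trie node over smaller containers."""
        new_node = TrieNode()
        for suffix in container.keys:  # sorted, so sub-containers stay sorted
            count = container.counts[suffix]
            if suffix == "":
                new_node.end_count = count
                continue
            sub = new_node.children.setdefault(suffix[0], Container())
            sub.keys.append(suffix[1:])
            sub.counts[suffix[1:]] = count
        return new_node

    def count(self, word):
        node = self.root
        for i, ch in enumerate(word):
            child = node.children.get(ch)
            if child is None:
                return 0
            if isinstance(child, Container):
                return child.counts.get(word[i + 1:], 0)
            node = child
        return node.end_count

    def items(self, node=None, prefix=""):
        """Yield (string, count) pairs in sorted order."""
        node = self.root if node is None else node
        if node.end_count:
            yield prefix, node.end_count
        for ch in sorted(node.children):
            child = node.children[ch]
            if isinstance(child, Container):
                for suffix in child.keys:
                    yield prefix + ch + suffix, child.counts[suffix]
            else:
                yield from self.items(child, prefix + ch)
```

In this sketch, frequent prefixes end up behind a few trie nodes while rare strings stay grouped in small containers, and `items()` returns the strings in sorted order because both the trie children and the container contents are visited in order. A sub-container created by a burst may itself exceed the limit; here it is simply burst on its next new insertion.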

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/506309.506312

Reference: 56 articles.

1. Aho, A. V., Hopcroft, J. E., and Ullman, J. D. 1983. Data Structures and Algorithms. Addison-Wesley, Reading, Massachusetts.

2. Aho, A. V., Sethi, R., and Ullman, J. D. 1986. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Massachusetts.

3. Algorithms for trie compaction

4. Improved behaviour of tries by adaptive branching

5. An efficient implementation of trie structures

Cited by 98 articles.

1. Accelerating String-Key Learned Index Structures via Memoization-Based Incremental Training;Proceedings of the VLDB Endowment;2024-04

2. CoCo-trie: Data-aware compression and indexing of strings;Information Systems;2024-02

3. Methods for Pangenomic Core Detection;Methods in Molecular Biology;2024

4. Catalyst: Optimizing Cache Management for Large In-memory Key-value Systems;Proceedings of the VLDB Endowment;2023-09

5. A distributed B+Tree indexing method for processing range queries over streaming data;Cluster Computing;2023-05-07