Affiliation:
1. Univ. di Pisa, Pisa, Italy
2. Univ. di Firenze, Florence, Italy
Abstract
We introduce a new text-indexing data structure, the String B-Tree, that can be seen as a link between some traditional external-memory and string-matching data structures. In a short phrase, it is a combination of B-trees and Patricia tries for internal-node indices that is made more effective by adding extra pointers to speed up search and update operations. Consequently, the String B-Tree overcomes the theoretical limitations of inverted files, B-trees, prefix B-trees, suffix arrays, compacted tries and suffix trees. String B-trees have the same worst-case performance as B-trees but they manage unbounded-length strings and perform much more powerful search operations such as the ones supported by suffix trees. String B-trees are also effective in main memory (RAM model) because they improve the online suffix tree search on a dynamic set of strings. They also can be successfully applied to database indexing and software duplication.
Publisher
Association for Computing Machinery (ACM)
Subject
Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software
References
53 articles.
1. The input/output complexity of sorting and related problems
2. Aho, A. V., Hopcroft, J. E., and Ullman, J. D. 1974. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass.
3. Hash functions for priority queues
4. Dynamic dictionary matching
5. Alphabet dependence in parameterized matching
Cited by
181 articles.
1. Brief Announcement: Scalable Distributed String Sorting;Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures;2024-06-17
2. Finding maximal exact matches in graphs;Algorithms for Molecular Biology;2024-03-11
3. CoCo-trie: Data-aware compression and indexing of strings;Information Systems;2024-02
4. Top-k query optimization on the hierarchical memory structure;2023 IEEE 6th International Conference on Automation, Electronics and Electrical Engineering (AUTEEE);2023-12-15
5. Coriolis: enabling metagenomic classification on lightweight mobile devices;Bioinformatics;2023-06-01