Abstract
Split trees are a new technique for searching sets of keys with highly skewed frequency distributions. A split tree is a binary search tree in which each node contains two key values: a node value, which is a maximally frequent key in that subtree, and a split value, which partitions the remaining keys (with respect to their lexical ordering) between the left and right subtrees. A median split tree (MST) uses the lexical median of a node's descendants as its split value to force the search tree to be perfectly balanced, achieving both a space-efficient representation of the tree and high search speed. Unlike frequency-ordered binary search trees, the cost of a successful search of an MST is log n bounded and very stable around minimal values. Further, an MST can be built for a given key ordering and set of frequencies in time n log n, as opposed to n^2 for an optimum binary search tree. A discussion of the application of MSTs to dictionary lookup for English is presented, and the performance obtained is contrasted with that of other techniques.
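The structure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `Node`, `build_mst`, and `search` are hypothetical, the input is assumed to be a list of `(key, frequency)` pairs already sorted by key, and ties in frequency are broken arbitrarily (here, by taking the first maximum). Each node stores the most frequent key of its subtree as its node value and the lexical median of the remaining keys as its split value.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    node_value: str                # a maximally frequent key in this subtree
    split_value: Optional[str]     # lexical median of the remaining keys (None for a leaf)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_mst(items: List[Tuple[str, int]]) -> Optional[Node]:
    """Build a median split tree from (key, frequency) pairs sorted by key."""
    if not items:
        return None
    # Pull out the most frequent key; it becomes this node's node value.
    i = max(range(len(items)), key=lambda j: items[j][1])
    node_key = items[i][0]
    rest = items[:i] + items[i + 1:]   # remaining keys, still in lexical order
    if not rest:
        return Node(node_key, None)
    # The lexical median of the remaining keys is the split value:
    # keys <= split go left, keys > split go right.
    mid = (len(rest) - 1) // 2
    split = rest[mid][0]
    return Node(node_key, split,
                build_mst(rest[:mid + 1]),
                build_mst(rest[mid + 1:]))


def search(root: Optional[Node], key: str) -> int:
    """Return the number of nodes visited if key is found, else 0."""
    node, visited = root, 0
    while node is not None:
        visited += 1
        if key == node.node_value:
            return visited
        if node.split_value is None:
            return 0
        node = node.left if key <= node.split_value else node.right
    return 0
```

With the illustrative (made-up) frequencies `[("a", 5), ("and", 30), ("in", 20), ("of", 40), ("the", 100), ("to", 25), ("zebra", 1)]`, the most frequent key `"the"` sits at the root and is found after visiting one node, while the rarest key `"zebra"` sits at the bottom of the balanced tree. Because every level halves the remaining keys, this sketch does linear work per level, in keeping with the n log n construction bound stated in the abstract.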
Publisher
Association for Computing Machinery (ACM)
Cited by
38 articles.
1. Classification via Two-Way Comparisons (Extended Abstract);Lecture Notes in Computer Science;2023
2. Rakshak;International Journal of Knowledge-Based Organizations;2022-05-20
3. On Huang and Wong’s algorithm for generalized binary split trees;Acta Informatica;2022-02-14
4. On the cost of unsuccessful searches in search trees with two-way comparisons;Information and Computation;2021-12
5. Twenty (simple) questions;Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing;2017-06-19