Affiliation:
1. University of Chile, Chile
2. National Institute of Informatics, Japan
Abstract
We propose new succinct representations of ordinal trees and match various space/time lower bounds. It is known that any
n
-node static tree can be represented in 2
n
+
o
(
n
) bits so that a number of operations on the tree can be supported in constant time under the word-RAM model. However, the data structures are complicated and difficult to dynamize. We propose a simple and flexible data structure, called the
range min-max tree
, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees. The result is extended to trees of arbitrary size, retaining constant time and reaching 2
n
+
O
(
n
/polylog(
n
)) bits of space. This space is optimal for a core subset of the operations supported and significantly lower than in any previous proposal.
For the dynamic case, where insertion/deletion (indels) of nodes is allowed, the existing data structures support a very limited set of operations. Our data structure builds on the range min-max tree to achieve 2
n
+
O
(
n
/log
n
) bits of space and
O
(log
n
) time for all operations supported in the static scenario, plus indels. We also propose an improved data structure using 2
n
+
O
(
n
log log
n
/log
n
) bits and improving the time to the optimal
O
(log
n
/log log
n
) for most operations. We extend our support to forests, where whole subtrees can be attached to or detached from others, in time
O
(log
1+ϵ
n
) for any ϵ > 0. Such operations had not been considered before.
Our techniques are of independent interest. An immediate derivation yields an improved solution to range minimum/maximum queries where consecutive elements differ by ± 1, achieving
n
+
O
(
n
/polylog(
n
)) bits of space. A second one stores an array of numbers supporting operations
sum
and
search
and limited updates, in optimal time
O
(log
n
/log log
n
). A third one allows representing dynamic bitmaps and sequences over alphabets of size σ, supporting rank/select and indels, within zero-order entropy bounds and time
O
(log
n
log σ/(log log
n
)
2
) for all operations. This time is the optimal
O
(log
n
/log log
n
) on bitmaps and polylog-sized alphabets. This improves upon the best existing bounds for entropy-bounded storage of dynamic sequences, compressed full-text self-indexes, and compressed-space construction of the Burrows-Wheeler transform.
Funder
Millennium Nucleus Information and Coordination in Networks ICM/FICP10-024F
Japan Society for the Promotion of Science
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)
Cited by
111 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Space‐efficient data structures for the inference of subsumption and disjointness relations;Software: Practice and Experience;2024-09-03
2. Accelerating range minimum queries with ray tracing cores;Future Generation Computer Systems;2024-08
3. Space & Time Efficient Leapfrog Triejoin;Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA);2024-06-09
4. Space-Efficient Data Structures for Polyominoes and Bar Graphs;2024 Data Compression Conference (DCC);2024-03-19
5. Rank and Select on Degenerate Strings;2024 Data Compression Conference (DCC);2024-03-19