Affiliation:
1. Duke University, Durham, NC
2. MADALGO, University of Aarhus, Denmark
3. Hong Kong University of Science and Technology, Hong Kong, China
Abstract
In this article we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of $N$ union and find operations, where each union operation joins two distinct sets, our algorithm uses $O(\mathrm{SORT}(N)) = O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$ I/Os, where $M$ is the memory size and $B$ is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses $O(\mathrm{SORT}(N) + \mathrm{MST}(N))$ I/Os, where $\mathrm{MST}(N)$ is the number of I/Os needed to compute the minimum spanning tree of a graph with $N$ edges. We also describe a simple and practical $O(\mathrm{SORT}(N) \log(N/M))$-I/O algorithm for this problem, which we have implemented.
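To make the problem statement concrete, the following is a minimal in-memory sketch of what a batched union-find computation has to produce: the whole operation sequence is known in advance, and the output is one answer per find. This is the textbook union-by-size structure with path compression, not the I/O-efficient algorithm of the article; the function name `batched_union_find` and the tuple encoding of operations are assumptions made for illustration only.

```python
from typing import List, Tuple


def batched_union_find(n: int, ops: List[Tuple]) -> List[int]:
    """Process a known-in-advance sequence of operations over elements 0..n-1.

    Each operation is ("union", a, b) or ("find", a) (hypothetical encoding).
    Returns, for every find in order, the representative of the set that
    contains the queried element at that point in the sequence.
    """
    parent = list(range(n))  # parent[x] == x means x is a representative
    size = [1] * n           # set sizes, valid at representatives only

    def find(x: int) -> int:
        # Locate the representative, then compress the path behind us.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    answers = []
    for op in ops:
        if op[0] == "union":
            a, b = find(op[1]), find(op[2])
            if a == b:
                continue  # a union that joins a set with itself is a no-op here
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a  # attach the smaller set below the larger one
            size[a] += size[b]
        else:
            answers.append(find(op[1]))
    return answers


ops = [
    ("union", 0, 1), ("find", 1), ("find", 0),
    ("union", 2, 3), ("find", 3), ("find", 0),
    ("union", 1, 3), ("find", 2), ("find", 0),
    ("find", 4),
]
answers = batched_union_find(5, ops)
```

Which element serves as a representative depends on the linking rule, so only equality between answers is meaningful: two finds return the same value exactly when the queried elements are in the same set at that point. The random memory accesses in `find` are what make this approach unsuitable once the data no longer fits in memory.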
We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over $\mathbb{R}^2$, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amounts of elevation data are being generated that are too large to fit in memory, so I/O-efficient algorithms are needed to process this data efficiently. In this article, we study two terrain-analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. We give the first $O(\mathrm{SORT}(N))$-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with $N$ vertices.
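As an illustration of why persistence computations lean on union-find, here is a small in-memory sketch that pairs local minima with the vertices at which their components merge. It sweeps the vertices of a graph by increasing height and applies the standard "elder rule". It is a simplified stand-in, not the article's I/O-efficient method: it handles only minimum-merge pairs, works on a plain vertex/edge graph rather than a full triangular mesh, and the name `zero_dim_persistence` and its input encoding are assumptions made for this example.

```python
from typing import List, Tuple


def zero_dim_persistence(heights: List[float],
                         edges: List[Tuple[int, int]]) -> List[Tuple[float, float]]:
    """Return sorted (birth_height, death_height) pairs of merged components."""
    n = len(heights)
    order = sorted(range(n), key=lambda v: (heights[v], v))  # ties broken by index
    rank = {v: i for i, v in enumerate(order)}

    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    parent = list(range(n))
    birth = list(range(n))  # birth[root] = lowest vertex of that component

    def find(x: int) -> int:
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    pairs = []
    for v in order:  # sweep upward through the height function
        for u in neighbors[v]:
            if rank[u] > rank[v]:
                continue  # u has not been swept yet
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            # Elder rule: the component with the higher minimum dies at v.
            if rank[birth[ru]] < rank[birth[rv]]:
                older, younger = ru, rv
            else:
                older, younger = rv, ru
            if birth[younger] != v:  # skip the trivial component {v} itself
                pairs.append((heights[birth[younger]], heights[v]))
            parent[younger] = older  # older root keeps its birth vertex
    return sorted(pairs)


# A path graph with three local minima (heights 1, 2 and 0).
pairs = zero_dim_persistence([1, 5, 2, 6, 0], [(0, 1), (1, 2), (2, 3), (3, 4)])
```

In this example the minimum at height 2 merges into an older component at height 5, and the minimum at height 1 merges at height 6, so `pairs` is `[(1, 6), (2, 5)]`; the global minimum at height 0 is never paired. Each step of the sweep issues a find or a union, which is the access pattern the batched formulation is meant to serve.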
Funder
Research Grants Council, University Grants Committee, Hong Kong
Army Research Office
National Science Foundation
Division of Environmental Biology
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)
Cited by
14 articles.