Competitive Data-Structure Dynamization

Authors:

Claire Mathieu (1), Rajmohan Rajaraman (2), Neal E. Young (3), Arman Yousefi (4)

Affiliations:

1. CNRS, France

2. Northeastern University, USA

3. University of California Riverside, USA

4. Google, USA

Abstract

Data-structure dynamization is a general approach for making static data structures dynamic. It is used extensively in geometric settings and, in the guise of so-called merge (or compaction) policies, in big-data databases such as LevelDB and Google Bigtable. Previous theoretical work is based on worst-case analyses for uniform inputs: insertions of one item at a time and a non-varying read rate. In practice, merge policies must not only handle batch insertions and varying read/write ratios, but can also take advantage of such non-uniformity to reduce cost on a per-input basis. To model this, we initiate the study of data-structure dynamization through the lens of competitive analysis, via two new online set-cover problems. For each, the input is a sequence of disjoint sets of weighted items. The sets are revealed one at a time. The algorithm must respond to each with a set cover that covers all items revealed so far. It obtains the cover incrementally from the previous cover by adding one or more sets and optionally removing existing sets. For each new set, the algorithm incurs a build cost equal to the weight of the items in the set. In the first problem, the objective is to minimize total build cost plus total query cost, where the algorithm incurs a query cost at each time \(t\) equal to the current cover size. In the second problem, the objective is to minimize the build cost while keeping the query cost from exceeding \(k\) (a given parameter) at any time. We give deterministic online algorithms for both variants, with competitive ratios of \(\Theta(\log^* n)\) and \(k\), respectively. The latter ratio is optimal for the second variant.
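The cost model described in the abstract can be illustrated with a minimal Python sketch. It only evaluates a given sequence of covers; it is not the paper's algorithm. The function name `cover_costs`, the item labels, and the two toy policies ("never merge" and "always merge") are assumptions introduced here for illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Cover = Set[FrozenSet[str]]


def cover_costs(
    batches: List[Dict[str, float]],
    covers: List[Cover],
    k: Optional[int] = None,
) -> Tuple[float, int]:
    """Return (total build cost, total query cost) for a sequence of covers.

    batches[t] maps each item revealed at time t to its weight.
    covers[t] is the cover in place after batch t is revealed.
    If k is given, a cover with more than k sets is rejected
    (the constraint of the second problem).
    """
    weight: Dict[str, float] = {}
    previous: Cover = set()
    build = 0.0
    query = 0
    for batch, cover in zip(batches, covers):
        weight.update(batch)
        # The cover must contain exactly the items revealed so far.
        if set().union(*cover) != set(weight):
            raise ValueError("cover does not match the revealed items")
        if k is not None and len(cover) > k:
            raise ValueError("cover size exceeds k")
        # Build cost: weight of every set that is new in this cover.
        for new_set in cover - previous:
            build += sum(weight[item] for item in new_set)
        # Query cost at this time step: the current cover size.
        query += len(cover)
        previous = cover
    return build, query


# Three unit-weight items inserted one at a time (hypothetical input).
batches = [{"a": 1.0}, {"b": 1.0}, {"c": 1.0}]

# Policy 1 (assumed): never merge, one new singleton set per batch.
never_merge = [
    {frozenset("a")},
    {frozenset("a"), frozenset("b")},
    {frozenset("a"), frozenset("b"), frozenset("c")},
]

# Policy 2 (assumed): always merge everything into a single set.
always_merge = [
    {frozenset("a")},
    {frozenset("ab")},
    {frozenset("abc")},
]

print(cover_costs(batches, never_merge))   # (3.0, 6)
print(cover_costs(batches, always_merge))  # (6.0, 3)
```

On this toy input the two policies trade build cost against query cost: never merging builds each item once but pays a growing query cost, while always merging keeps the cover at size one but rebuilds every item at each step. Passing `k=1` would accept only the second policy.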

Publisher

Association for Computing Machinery (ACM)

