Succinct indexes for strings, binary relations and multilabeled trees

Authors:

Barbay Jérémy (1), He Meng (2), Munro J. Ian (2), Satti Srinivasa Rao (3)

Affiliation:

1. University of Chile, Chile

2. University of Waterloo, Canada

3. Seoul National University, South Korea

Abstract

We define and design succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that ideally occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the given data, and support an extended set of operations using the basic operators defined in the ADT. The main advantage of succinct indexes as opposed to succinct (integrated data/index) encodings is that we make assumptions only on the ADT through which the main data is accessed, rather than the way in which the data is encoded. This allows more freedom in the encoding of the main data. In this article, we present succinct indexes for various data types, namely strings, binary relations and multilabeled trees. Given the support for the interface of the ADTs of these data types, we can support various useful operations efficiently by constructing succinct indexes for them. When the operators in the ADTs are supported in constant time, our results are comparable to previous results, while allowing more flexibility in the encoding of the given data. Using our techniques, we design a succinct encoding that represents a string of length $n$ over an alphabet of size $\sigma$ using $nH_k(S) + \lg\sigma \cdot o(n) + O(n \lg\sigma / \lg\lg\lg\sigma)$ bits to support access/rank/select operations in $o((\lg\lg\sigma)^{1+\epsilon})$ time, for any fixed constant $\epsilon > 0$. We also design a succinct text index using $nH_0(S) + O(n \lg\sigma / \lg\lg\sigma)$ bits that supports finding all the $occ$ occurrences of a given pattern of length $m$ in $O(m \lg\lg\sigma + occ \lg n / \lg^{\epsilon}\sigma)$ time, for any fixed constant $0 < \epsilon < 1$. Previous results on these two problems either have a $\lg\sigma$ factor instead of $\lg\lg\sigma$ in the running time, or are not compressed. Finally, we present succinct encodings of binary relations and multi-labeled trees that are more compact than previous structures.
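The separation between index and encoding described in the abstract can be illustrated with a toy sketch. This is not the article's construction and it is not succinct: it only shows an auxiliary structure answering rank and select on a string while touching the data solely through an `access(i)` operator, so the string itself may be stored in any encoding. The class name `SampledRankIndex`, the `block` parameter and the sampling scheme are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Callable, Dict, List


class SampledRankIndex:
    """Toy auxiliary index (hypothetical, not the paper's structure).

    It reads the string only through access(i), so it makes no
    assumption about how the string is actually encoded.
    """

    def __init__(self, access: Callable[[int], str], n: int, block: int = 4):
        self.access = access
        self.n = n
        self.block = block
        # samples[c][j] = number of occurrences of c in positions [0, j * block)
        self.samples: Dict[str, List[int]] = defaultdict(list)
        running: Dict[str, int] = defaultdict(int)
        alphabet = {access(i) for i in range(n)}
        for i in range(n):
            if i % block == 0:
                for c in alphabet:
                    self.samples[c].append(running[c])
            running[access(i)] += 1

    def rank(self, c: str, i: int) -> int:
        """Number of occurrences of c in positions [0, i)."""
        if c not in self.samples:
            return 0
        # Start from the nearest sample at or before i, then scan via access.
        j = min(i // self.block, len(self.samples[c]) - 1)
        count = self.samples[c][j]
        for pos in range(j * self.block, i):
            if self.access(pos) == c:
                count += 1
        return count

    def select(self, c: str, k: int) -> int:
        """Position of the k-th occurrence of c (k >= 1), or -1 if none."""
        if k < 1 or c not in self.samples:
            return -1
        # Last sampled block whose prefix count is still below k.
        j = bisect_left(self.samples[c], k) - 1
        seen = self.samples[c][j]
        for pos in range(j * self.block, self.n):
            if self.access(pos) == c:
                seen += 1
                if seen == k:
                    return pos
        return -1


# The data could live in any encoding; here it is a plain Python string.
text = "abracadabra"
index = SampledRankIndex(text.__getitem__, len(text))
```

Swapping `text.__getitem__` for the accessor of a compressed representation would leave the index code unchanged, which is the flexibility the abstract attributes to succinct indexes over integrated encodings.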

Publisher

Association for Computing Machinery (ACM)

Subject

Mathematics (miscellaneous)

Cited by 38 articles.

1. Rank and Select on Degenerate Strings;2024 Data Compression Conference (DCC);2024-03-19

2. Block trees;Journal of Computer and System Sciences;2021-05

3. On the Memory Requirement of Hop-by-Hop Routing: Tight Bounds and Optimal Address Spaces;IEEE/ACM Transactions on Networking;2020-06

4. Succinct Encodings for Families of Interval Graphs;Algorithmica;2020-04-25

5. The Function-Inversion Problem: Barriers and Opportunities;Theory of Cryptography;2019
