Theory and practice of monotone minimal perfect hashing

Published:2011-05 Volume:16
ISSN:1084-6654
Container-title:ACM Journal of Experimental Algorithmics
language:en
Short-container-title:ACM J. Exp. Algorithmics

Author:

Belazzougui Djamal¹,Boldi Paolo²,Pagh Rasmus³,Vigna Sebastiano²

Affiliation:

1. Université Paris Diderot--Paris 7, France

2. Università degli Studi di Milano, Italy

3. IT University of Copenhagen, Denmark

Abstract

Minimal perfect hash functions have been shown to be useful to compress data in several data management tasks. In particular, order-preserving minimal perfect hash functions (Fox et al. 1991) have been used to retrieve the position of a key in a given list of keys; however, the ability to preserve any given order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the function. Recently, it was observed (Belazzougui et al. 2009) that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case for dictionaries of search engines, lists of URLs of Web graphs, and so on. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyze experimentally the data structures proposed in Belazzougui et al. (2009), and along the way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practice and provide a balance between access speed, ease of construction, and space usage.
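
To make the contract described in the abstract concrete, here is a minimal Python sketch of what a monotone minimal perfect hash function returns: the rank of each key within a lexicographically sorted key set. This is not one of the data structures from the paper. It is a naive baseline that stores every key and answers by binary search, so it takes far more space than the structures the abstract refers to. The function name `build_monotone_mph` and the example URLs are assumptions made for illustration only.

```python
from bisect import bisect_left


def build_monotone_mph(sorted_keys):
    """Return a function mapping each key to its rank in sorted_keys.

    Illustrative baseline only (hypothetical helper, not from the paper):
    it keeps the whole key list in memory and uses binary search.
    """
    keys = list(sorted_keys)
    # The monotone setting assumes distinct keys in lexicographical order.
    if any(a >= b for a, b in zip(keys, keys[1:])):
        raise ValueError("keys must be distinct and sorted")

    def rank(key):
        # The result is only meaningful for keys that belong to the set.
        return bisect_left(keys, key)

    return rank


# Example key set (made up): URLs already in lexicographical order.
urls = ["http://a.example/", "http://a.example/x", "http://b.example/"]
h = build_monotone_mph(urls)
ranks = [h(u) for u in urls]
```

Each key maps to a distinct integer in the range 0 to n-1 (minimal and perfect), and the integers increase with the keys (monotone). Because the order is the keys' own sorted order rather than an arbitrary one, the function does not need to encode a permutation, which is why the Ω(n log n) bound for order-preserving functions does not apply to this restricted problem.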

Publisher

Association for Computing Machinery (ACM)

Subject

Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/1963190.2025378

References: 32 articles.

1. UbiCrawler: a scalable fully distributed Web crawler

2. A large time-aware web graph

Cited by 10 articles.

1. A Learned Approach to Design Compressed Rank/Select Data Structures;ACM Transactions on Algorithms;2022-07-31

2. Compressing and Querying Integer Dictionaries Under Linearities and Repetitions;IEEE Access;2022

3. A Learned Prefix Bloom Filter for Spatial Data;Lecture Notes in Computer Science;2022

4. A Dynamic Repository Approach for Small File Management With Fast Access Time on Hadoop Cluster: Hash Based Extended Hadoop Archive;IEEE Access;2022

5. On Representing the Degree Sequences of Sublogarithmic-Degree Wheeler Graphs;String Processing and Information Retrieval;2022