Affiliation:
1. University of Copenhagen, Denmark
Abstract
Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here, we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees.
Simple tabulation hashing
dates back to Zobrist (A new hashing method with application for game playing. Technical Report 88, Computer Sciences Department, University of Wisconsin). Keys are viewed as consisting of
c
characters and we have precomputed character tables
h
1
, . . .,
h
c
mapping characters to random hash values. A key
x
= (
x
1
, . . .,
x
c
) is hashed to
h
1
[
x
1
] ⊕
h
2
[
x
2
]..... ⊕
h
c
[
x
c
] This schemes is very fast with character tables in cache. Although simple tabulation is not even four-independent, it does provide many of the guarantees that are normally obtained via higher independence, for example, linear probing and Cuckoo hashing.
Next, we consider
twisted tabulation
where one input character is "twisted" in a simple way. The resulting hash function has powerful distributional properties: Chernoff-style tail bounds and a very small bias for minwise hashing. This is also yields an extremely fast pseudorandom number generator that is provably good for many classic randomized algorithms and data-structures.
Finally, we consider
double tabulation
where we compose two simple tabulation functions, applying one to the output of the other, and show that this yields very high independence in the classic framework of Wegman and Carter. In fact, w.h.p., for a given set of size proportional to that of the space consumed, double tabulation gives fully random hashing. We also mention some more elaborate tabulation schemes getting near-optimal independence for given time and space.
Although these tabulation schemes are all easy to implement and use, their analysis is not.
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献