HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm

Published: 2007-01-01
Issue: Proceedings
Volume: DMTCS Proceedings vol. AH,...
ISSN: 1365-8050
Container-title: Discrete Mathematics & Theoretical Computer Science
Language: en

Author:

Flajolet Philippe, Fusy Éric, Gandouet Olivier, Meunier Frédéric

Abstract

This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of distinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of $m$ units (typically, "short bytes"), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
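
To make the abstract's figures concrete, the following is a minimal Python sketch of a HyperLogLog-style estimator, not the authors' reference implementation. Several choices are assumptions made for illustration: the class name HyperLogLogSketch, the use of SHA-1 truncated to 64 bits as the hash function, the default of m = 2^11 registers, and the helper names standard_error and memory_ratio_vs_loglog. The LOGLOG constant 1.30 does not appear in this abstract; it is taken from the earlier LogLog analysis and is used here only to reproduce the 64% figure.

```python
import hashlib
import math


def standard_error(m: int) -> float:
    """Typical relative accuracy quoted in the abstract: about 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(m)


def memory_ratio_vs_loglog() -> float:
    """Fraction of LOGLOG's registers needed to match its accuracy.

    Assumes LOGLOG's standard error is about 1.30 / sqrt(m) (not stated
    in this abstract). Equal accuracy then needs (1.04 / 1.30)^2 of the memory.
    """
    return (1.04 / 1.30) ** 2


class HyperLogLogSketch:
    """Illustrative estimator with m = 2**p registers (hypothetical name)."""

    def __init__(self, p: int = 11) -> None:
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Assumption: SHA-1 truncated to 64 bits stands in for a uniform hash.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        index = h >> width                 # first p bits select a register
        rest = h & ((1 << width) - 1)      # remaining bits
        # Rank = position of the leftmost 1-bit in `rest` (width + 1 if none).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)   # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    sketch = HyperLogLogSketch(p=11)
    for i in range(100_000):
        sketch.add(f"user-{i}")
    print("estimate:", round(sketch.estimate()))
    print("standard error for m=2048:", standard_error(2048))
    print("memory ratio vs LOGLOG:", memory_ratio_vs_loglog())
```

Each item touches a single register and the registers only ever grow, so adding the same item twice leaves the estimate unchanged. Two sketches built with the same p can also be merged by taking the register-wise maximum, which is one way to read the abstract's remark that the algorithm parallelizes well.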

Publisher

Centre pour la Communication Scientifique Directe (CCSD)

Subject

Discrete Mathematics and Combinatorics, General Computer Science, Theoretical Computer Science

Cited by 141 articles.

1. MpScope: Enabling multi-pipeline monitoring inside a switch; Computer Networks; 2024-12

2. Multi-source data integration for explainable miRNA-driven drug discovery; Future Generation Computer Systems; 2024-11

3. CardSketch: Shift Attention for Network-wide Cardinality Telemetry; 2024 IEEE 49th Conference on Local Computer Networks (LCN); 2024-10-08

4. QSketch: An Efficient Sketch for Weighted Cardinality Estimation in Streams; Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2024-08-24

5. SAROS: A Self-Adaptive Routing Oblivious Sampling Method for Network-wide Heavy Hitter Detection; Proceedings of the 8th Asia-Pacific Workshop on Networking; 2024-08-03