Affiliation:
1. University of Rochester
2. Carnegie Mellon University
3. Georgia Inst. of Technology
Abstract
Using entropy of traffic distributions has been shown to aid a wide variety of network monitoring applications such as anomaly detection, clustering to reveal interesting patterns, and traffic classification. However, realizing this potential benefit in practice requires accurate algorithms that can operate on high-speed links, with low CPU and memory requirements. In this paper, we investigate the problem of estimating the entropy in a streaming computation model. We give lower bounds for this problem, showing that neither approximation nor randomization alone will let us compute the entropy efficiently. We present two algorithms for randomly approximating the entropy in a time and space efficient manner, applicable for use on very high speed (greater than OC-48) links. The first algorithm for entropy estimation is inspired by the structural similarity with the seminal work of Alon et al. for estimating frequency moments, and we provide strong theoretical guarantees on the error and resource usage. Our second algorithm utilizes the observation that the performance of the streaming algorithm can be enhanced by separating the high-frequency items (or elephants) from the low-frequency items (or mice). We evaluate our algorithms on traffic traces from different deployment scenarios.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications,Hardware and Architecture,Software
Cited by
54 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献