Affiliation:
1. Georgia Institute of Technology
2. AT&T Labs--Research
Abstract
Estimation of traffic matrices, which provide critical input for network capacity planning and traffic engineering, has recently been recognized as an important research problem. Most previous approaches infer the traffic matrix from either SNMP link loads or sampled NetFlow records. In this work, we design novel inference techniques that, by statistically correlating SNMP link loads and sampled NetFlow records, allow for much more accurate estimation of traffic matrices than is obtainable from either information source alone, even when sampled NetFlow records are available at only a subset of ingress points. Our techniques are practically important and useful since both SNMP and NetFlow are now widely supported by vendors and deployed in most operational IP networks. More importantly, this research leads us to a new insight: SNMP link loads and sampled NetFlow records can serve as "error correction codes" for each other. This insight helps us solve a challenging open problem in traffic matrix estimation: how to deal with dirty data (SNMP and NetFlow measurement errors due to hardware, software, or transmission problems). We design techniques that, by comparing notes between the two information sources, identify and remove dirty data, and therefore allow for accurate estimation of the traffic matrices from the cleaned data.
We conducted experiments on real measurement data obtained from a large tier-1 ISP backbone network. We show that, when full deployment of NetFlow is not available, our algorithm can improve estimation accuracy significantly even with a small fraction of NetFlow data. More importantly, we show that dirty data can contaminate a traffic matrix, and that identifying and removing them can reduce errors in traffic matrix estimation by up to an order of magnitude. Routing changes are another key factor that affects estimation accuracy. We show that, by using routing changes as a priori information, the traffic matrices can be estimated much more accurately than when routing changes are ignored. To the best of our knowledge, this work is the first to offer a comprehensive solution that takes full advantage of multiple readily available data sources. Our results provide valuable insights into the effectiveness of combining flow measurement and link load measurement.
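The idea of letting the two data sources check each other can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a known 0/1 routing matrix, NetFlow-based estimates for every origin-destination flow, and a hypothetical relative tolerance `rel_tol`, and it only flags links whose SNMP load disagrees with the load implied by the flow estimates. The function name and all numbers are illustrative.

```python
def flag_inconsistent_links(routing, link_loads, flow_estimates, rel_tol=0.2):
    """Return indices of links whose SNMP load disagrees with the load
    implied by NetFlow-based flow estimates by more than rel_tol.

    routing[i][j] is 1 if flow j traverses link i, else 0 (assumed known).
    """
    suspects = []
    for link, (row, load) in enumerate(zip(routing, link_loads)):
        # Load this link should carry if the flow estimates were exact.
        implied = sum(a * x for a, x in zip(row, flow_estimates))
        # Normalize by the larger value; the floor avoids division by zero.
        scale = max(load, implied, 1e-9)
        if abs(load - implied) / scale > rel_tol:
            suspects.append(link)
    return suspects


# Toy example: two links, three origin-destination flows.
routing = [[1, 1, 0],
           [0, 1, 1]]
flow_estimates = [10.0, 20.0, 30.0]   # hypothetical NetFlow-based estimates
link_loads = [31.0, 95.0]             # hypothetical SNMP readings
suspects = flag_inconsistent_links(routing, link_loads, flow_estimates)
print(suspects)
```

In this toy input the second link is flagged, because its reported load is far from the sum of the flows routed over it. A disagreement alone does not say which source is wrong; deciding that requires additional evidence, such as other links that share the same flows.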
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications,Hardware and Architecture,Software
References: 21 articles.
1. D.L. Donoho. For most large underdetermined systems of equations the minimal l1-norm near solution approximates the sparsest near-solution. In http://www-stat.stanford.edu/~donoho/Reports/ 2004.
2. Predicting resource usage and estimation accuracy in an IP flow measurement collection infrastructure
3. Deriving traffic demands for operational IP networks: methodology and experience
Cited by
29 articles.