Affiliation:
1. AT&T Labs-Research, Bedminster, NJ, USA
Abstract
Spatiotemporal streams are prone to data quality issues such as missing, duplicated, and delayed data. These issues arise when data-generating sensors malfunction, when data transmissions experience problems, or when data are stored or processed improperly. However, many important real-time applications rely on the continuous availability of stream values, e.g., to monitor traffic flow, resource usage, or weather phenomena. Non-real-time applications that support continuous or offline historical analytics also require high-quality data to avoid producing misleading output such as false positives and erroneous conclusions and decisions.
In this article, we study the problem of smoothing streams produced by an overlay of sensors. We present nonparametric (data-driven, distribution-free) statistical methods to provide an uninterrupted stream of high-quality spatiotemporal data to real-time applications, even when the raw stream suffers from data quality issues such as noise or missing values. Our novel family of robust methods computes smoothed values (SVs) that can be used as proxies for data of questionable quality. The methods partition the monitored area into cells and compute SVs from historical data and from the deviation from normalcy in neighboring spatial cells, in a way that outperforms standard regression or interpolation. Our methods use incremental computation for efficiency, and they differ in how the deviations are normalized, e.g., with respect to zeroth-order, first-order, and second-order moments. We use three real data sets to run a suite of experiments and empirically demonstrate the superiority of the method that uses normalization with respect to variability.
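The idea of combining a cell's own history with the normalized deviations of its neighbors can be illustrated with a minimal sketch. This is not the article's algorithm: the choice of the median as the "normal" level, the median absolute deviation (MAD) as the variability measure, and the names `mad` and `smoothed_value` are all assumptions made for illustration, and the sketch recomputes everything from scratch rather than incrementally.

```python
from statistics import median


def mad(values, center):
    """Median absolute deviation around `center` (assumed robust scale)."""
    return median(abs(v - center) for v in values)


def smoothed_value(history, current, cell, neighbors):
    """Hypothetical proxy value for `cell`.

    history:   dict mapping cell id -> list of past readings
    current:   dict mapping cell id -> latest reading, or None if missing
    neighbors: cell ids treated as spatial neighbors of `cell`
    """
    # Deviation of each neighbor from its own normal level,
    # normalized by that neighbor's own variability.
    deviations = []
    for n in neighbors:
        reading = current.get(n)
        if reading is None:
            continue  # neighbor is missing too; skip it
        center = median(history[n])
        scale = mad(history[n], center)
        if scale > 0:
            deviations.append((reading - center) / scale)

    center = median(history[cell])
    if not deviations:
        # No usable neighbors: fall back to the cell's historical level.
        return center

    # Rescale the typical neighbor deviation to this cell's variability.
    return center + median(deviations) * mad(history[cell], center)


history = {
    "a": [10, 12, 11, 9, 13],
    "b": [100, 110, 90, 105, 95],
    "c": [50, 52, 48, 51, 49],
}
current = {"a": None, "b": 120, "c": 54}  # cell "a" has a missing reading
print(smoothed_value(history, current, "a", ["b", "c"]))
```

In this toy example both neighbors sit four MADs above their usual level, so the proxy for cell "a" is placed four of its own MADs above its historical median. Normalizing by variability is what lets cells with very different magnitudes inform one another.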
Publisher
Association for Computing Machinery (ACM)
Subject
Discrete Mathematics and Combinatorics, Geometry and Topology, Computer Science Applications, Modeling and Simulation, Information Systems, Signal Processing
Cited by
4 articles.