Affiliation:
1. Shanghai Jiao Tong University, Apex Data & Knowledge Management Lab, Shanghai, China
2. JD Finance, Urban Computing Lab, BDA, China
Abstract
Urban anomalies, such as abnormal crowd movements and accidents, may result in loss of life or property if not handled properly. It would be of great value to governments if alerts for anomalies could be raised automatically at an early stage. However, detecting anomalies in urban areas poses two main challenges. First, the criteria for determining an anomaly differ markedly across occasions (e.g. rainy days vs. sunny days, or holidays vs. workdays) and across places (e.g. tourist attractions vs. office areas), because each occasion and place has its own definition of a normal pattern. Second, urban anomalies often take complex forms (e.g. a road closure may cause a decrease in taxi flow and an increase in bike flow). We need an algorithm that models not only the anomaly degree of each individual data source but also the combination of changes across multiple data sources. In this paper, we propose a two-step method to tackle these challenges. In the first step, we use a similarity-based algorithm to estimate an anomaly score for each individual data source in each region and time slot, based on the values of historically similar regions. Those scores are fed into the second step, where we propose an algorithm based on a one-class Support Vector Machine to capture rare patterns that occur across multiple data sources, nearby regions, or time slots, and to give a final, integrated anomaly score for each region. Evaluations on both synthetic and real-world datasets show the advantages of our method over baseline techniques such as distance-based and probability-based methods.
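To make the first step more concrete, the following is a minimal sketch of one way a similarity-based score could be computed for a single data source. It is not the authors' algorithm: the abstract does not specify the similarity measure or the scoring formula, so the use of Pearson correlation, the gap-based z-score, and every name here (`pearson`, `anomaly_score`, `k`, the toy regions A to D) are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def anomaly_score(history, current, region, k=2, eps=1e-9):
    """Score how far `region` deviates from its k historically most similar regions.

    history: dict mapping region -> list of past values (equal lengths)
    current: dict mapping region -> value in the time slot being scored
    Assumption: similarity is Pearson correlation, and the score is the mean
    absolute z-score of the current gap to each peer against historical gaps.
    """
    target = history[region]
    # Pick the k other regions whose history correlates most with the target.
    peers = sorted(
        (r for r in history if r != region),
        key=lambda r: pearson(target, history[r]),
        reverse=True,
    )[:k]
    scores = []
    for p in peers:
        # Historical gap between the target and this peer.
        gaps = [t - q for t, q in zip(target, history[p])]
        mu, sigma = mean(gaps), pstdev(gaps)
        gap_now = current[region] - current[p]
        # eps avoids division by zero when the historical gap never varied.
        scores.append(abs(gap_now - mu) / (sigma + eps))
    return mean(scores)


# Toy data (hypothetical): per-slot flow counts for four regions.
history = {
    "A": [10, 12, 11, 13, 12],
    "B": [20, 23, 21, 25, 24],
    "C": [5, 6, 6, 7, 6],
    "D": [9, 3, 8, 2, 7],
}
normal_score = anomaly_score(history, {"A": 12, "B": 23, "C": 6, "D": 5}, "A")
spike_score = anomaly_score(history, {"A": 30, "B": 23, "C": 6, "D": 5}, "A")
print(normal_score, spike_score)
```

In this toy setup the slot in which region A jumps while its peers stay flat receives a much higher score than the ordinary slot. In the two-step method described above, per-source scores of this kind would then be combined by the one-class SVM stage, which this sketch does not cover.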
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Human-Computer Interaction
Cited by
34 articles.