Affiliation:
1. University of Minnesota, USA
2. Qatar Computing Research Institute and University of Minnesota, USA
Abstract
Autologistic regression is one of the most popular statistical tools for predicting spatial phenomena in applications such as epidemic disease detection, species occurrence prediction, earth observation, and business management. In general, autologistic regression divides the space into a two-dimensional grid and performs the prediction at each cell of the grid. The prediction at any location is based on a set of predictors (i.e., features) at that location and on the predictions at neighboring locations. In this article, we address the problem of building efficient autologistic models with multinomial (i.e., categorical) prediction and predictor variables, where the categories represented by these variables are unordered. Unfortunately, existing methods for building autologistic models are designed for binary variables and are computationally expensive (i.e., they do not scale to large grid data such as fine-grained satellite images). Therefore, we introduce RegRocket: a scalable framework to build multinomial autologistic models for predicting large-scale spatial phenomena. RegRocket considers both accuracy and efficiency when learning the regression model parameters. To this end, RegRocket is built on top of Markov Logic Networks (MLN), a scalable statistical learning framework, whose internals and data structures are optimized to process spatial data. RegRocket provides an equivalent representation of the multinomial prediction and predictor variables in MLN, where the dependencies between these variables are transformed into first-order logic predicates. Then, RegRocket employs an efficient framework that learns the model parameters from the MLN representation in a distributed manner. Extensive experiments on two large real datasets show that RegRocket can build multinomial autologistic models, in minutes, for 1 million grid cells with a 0.85 average F1-score.
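To make the grid-based prediction described in the abstract concrete, the following is a minimal sketch of a generic multinomial autologistic prediction for a single cell. It is not RegRocket's MLN-based method. The function names, the 4-connected neighborhood, the numeric (e.g., one-hot encoded) feature vector, and the parameters `beta` and `eta` are all assumptions chosen for illustration.

```python
import math

def neighbor_counts(labels, row, col, num_classes):
    """Count how many 4-connected neighbors of (row, col) carry each class."""
    counts = [0] * num_classes
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        # Skip neighbors that fall outside the grid.
        if 0 <= r < len(labels) and 0 <= c < len(labels[0]):
            counts[labels[r][c]] += 1
    return counts

def class_probabilities(features, counts, beta, eta):
    """Softmax over per-class scores: dot(beta[k], features) + eta * counts[k].

    beta (one weight vector per class) and eta (neighbor weight) are
    hypothetical parameters; a real model would learn them from data.
    """
    scores = [
        sum(b * x for b, x in zip(beta[k], features)) + eta * counts[k]
        for k in range(len(beta))
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 3x3 grid of current class labels (three unordered categories: 0, 1, 2).
labels = [
    [0, 1, 1],
    [0, 2, 1],
    [0, 0, 1],
]
beta = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]  # assumed weights
eta = 0.8                                      # assumed neighbor weight
features = [1.0, 2.0]                          # assumed predictors at cell (1, 1)

counts = neighbor_counts(labels, 1, 1, 3)
probs = class_probabilities(features, counts, beta, eta)
predicted = probs.index(max(probs))
```

In this sketch, the neighbor term pulls a cell toward the categories that dominate its surroundings, which is the spatial dependence that distinguishes autologistic regression from ordinary multinomial logistic regression.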
Funder
National Science Foundation, USA
Publisher
Association for Computing Machinery (ACM)
Subject
Discrete Mathematics and Combinatorics, Geometry and Topology, Computer Science Applications, Modelling and Simulation, Information Systems, Signal Processing
Cited by
1 article.
1. Flash;SIGSPATIAL Special;2020-02-13