Author:
da Silva Camilla, Nisenson Jed, Boisvert Jeff
Abstract
Machine learning algorithms are increasingly applied to spatial numerical modeling, so it is important to understand when such methods will underperform. Machine learning algorithms are affected by dataset shift: when a modeling domain is non-stationary, there is no guarantee that a trained model remains effective in unsampled areas. This work compares the stationarity requirement of geostatistical methods with the concept of dataset shift. A workflow is developed to detect dataset shift in spatial data prior to modeling; it applies a discriminative classifier and a two-sample Kolmogorov-Smirnov test to the model areas. When required, a lazy learning modification of support vector regression (SVR) is proposed to account for dataset shift. The benefits of the lazy learning algorithm are demonstrated on the well-known non-stationary Walker Lake dataset, where it improves root mean squared error by up to 25% relative to the standard SVR approach in areas where dataset shift is present.
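The two ideas in the abstract, shift detection and lazy (local) SVR, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The helper names `detect_shift` and `lazy_svr_predict`, the random forest used as the discriminative classifier, the per-feature application of the Kolmogorov-Smirnov test, and the k-nearest-neighbour definition of "local" are all assumptions made for illustration. It relies on NumPy, SciPy, and scikit-learn.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVR


def detect_shift(X_train, X_target, seed=0):
    """Hypothetical helper: compare features of sampled vs. target areas."""
    # Two-sample Kolmogorov-Smirnov test, applied feature by feature
    # (an assumption; the abstract does not say how the test is applied).
    ks_pvalues = [
        ks_2samp(X_train[:, j], X_target[:, j]).pvalue
        for j in range(X_train.shape[1])
    ]
    # Discriminative classifier: label each row by its origin and check
    # whether a classifier can separate the two sets.
    # A cross-validated AUC near 0.5 suggests no detectable shift.
    X = np.vstack([X_train, X_target])
    origin = np.r_[np.zeros(len(X_train), dtype=int),
                   np.ones(len(X_target), dtype=int)]
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    proba = cross_val_predict(clf, X, origin, cv=5,
                              method="predict_proba")[:, 1]
    return {"ks_pvalues": ks_pvalues, "auc": roc_auc_score(origin, proba)}


def lazy_svr_predict(X_train, y_train, X_query, k=50, **svr_kwargs):
    """Hypothetical helper: fit one SVR per query point on its k nearest
    training samples, instead of a single global model."""
    nn = NearestNeighbors(n_neighbors=min(k, len(X_train))).fit(X_train)
    _, neighbor_idx = nn.kneighbors(X_query)
    preds = np.empty(len(X_query))
    for i, idx in enumerate(neighbor_idx):
        local_model = SVR(**svr_kwargs).fit(X_train[idx], y_train[idx])
        preds[i] = local_model.predict(X_query[i:i + 1])[0]
    return preds
```

In this sketch, a small KS p-value or an AUC well above 0.5 would indicate that the target area differs from the sampled area; the thresholds for acting on either are left to the user. The lazy variant trades training cost for locality, since a new model is fitted for every query location.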
Publisher
Springer International Publishing