Abstract
Errors, gaps, and outliers complicate and sometimes
invalidate the analysis of time series. While most fields have developed
their own strategy to clean the raw data, no generic procedure has been
promoted to standardize the pre-processing. This lack of harmonization makes
the inter-comparison of studies difficult, and leads to screening methods
that can be arbitrary or case-specific. This study provides a generic
pre-processing procedure implemented in R (ctbi for cyclic/trend
decomposition using bin interpolation) dedicated to univariate time series.
Ctbi bins the data, decomposes the time series into a long-term trend
and a cyclic component (quantified by a new metric, the Stacked Cycles
Index), and finally aggregates the data. Outliers are flagged
with an enhanced box plot rule called Logbox that corrects biases due to the
sample size and that is adapted to non-Gaussian residuals. Three different
Earth science datasets (contaminated with gaps and outliers) are
successfully cleaned and aggregated with ctbi. This illustrates the
robustness of the procedure, which can be valuable to any discipline.
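The binning, decomposition, and flagging steps described above can be illustrated with a minimal Python sketch. It is not the ctbi package or its API, and several parts are simplifying assumptions: the trend is a step function of bin medians rather than an interpolation between bins, outliers are flagged with the classical Tukey box plot rule (1.5 times the interquartile range) rather than Logbox, and the Stacked Cycles Index is not computed. All function names (`decompose`, `tukey_flags`) and the synthetic series are hypothetical.

```python
import math
import statistics


def decompose(values, bin_size):
    """Split a series (gaps given as None) into trend, mean cycle, residuals.

    Assumption: the trend is the median of each bin, held constant over
    the bin. This is a simplification of a bin-based trend.
    """
    levels = []
    for start in range(0, len(values), bin_size):
        chunk = [v for v in values[start:start + bin_size] if v is not None]
        levels.append(statistics.median(chunk) if chunk else None)
    trend = [levels[i // bin_size] for i in range(len(values))]

    # Remove the trend; gaps (or empty bins) stay as None.
    detrended = [
        None if v is None or t is None else v - t
        for v, t in zip(values, trend)
    ]

    # Stack all bins on top of each other and take the median at each
    # position inside the bin to estimate the cyclic component.
    cycle = []
    for pos in range(bin_size):
        stacked = [d for d in detrended[pos::bin_size] if d is not None]
        cycle.append(statistics.median(stacked) if stacked else 0.0)

    residuals = [
        None if d is None else d - cycle[i % bin_size]
        for i, d in enumerate(detrended)
    ]
    return trend, cycle, residuals


def tukey_flags(residuals, k=1.5):
    """Flag residuals outside [Q1 - k*IQR, Q3 + k*IQR] (classical box plot rule)."""
    present = [r for r in residuals if r is not None]
    q1, _, q3 = statistics.quantiles(present, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r is not None and bool(r < low or r > high) for r in residuals]


# Synthetic series: 6 bins of 4 points, slow trend, 4-point cycle, small noise.
true_cycle = [0.0, 1.0, 0.0, -1.0]
series = [
    0.5 * (i // 4) + true_cycle[i % 4] + 0.1 * math.sin(i)
    for i in range(24)
]
series[5] = None    # a gap
series[10] = 50.0   # an outlier

trend, cycle, residuals = decompose(series, bin_size=4)
flags = tukey_flags(residuals)
print("flagged indices:", [i for i, f in enumerate(flags) if f])
```

Medians are used at both the bin and the stacking step so that a single extreme value does not pull the estimated trend or cycle toward itself. A fixed 1.5 threshold ignores sample size and residual shape, which is the limitation the abstract attributes to the plain box plot rule.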
Subject
General Earth and Planetary Sciences, General Engineering, General Environmental Science