Abstract
Capturing the dynamical properties of time series concisely as interpretable feature vectors can enable efficient clustering and classification for time-series applications across science and industry. Selecting an appropriate feature-based representation of time series for a given application can be achieved through systematic comparison across a comprehensive time-series feature library, such as those in the hctsa toolbox. However, this approach is computationally expensive and involves evaluating many similar features, limiting the widespread adoption of feature-based representations of time series for real-world applications. In this work, we introduce a method to infer small sets of time-series features that (i) exhibit strong classification performance across a given collection of time-series problems, and (ii) are minimally redundant. Applying our method to a set of 93 time-series classification datasets (containing over 147 000 time series, including biomedical datasets) and using a filtered version of the hctsa feature library (4791 features), we introduce a generically useful set of 22 CAnonical Time-series CHaracteristics, catch22. This dimensionality reduction, from 4791 to 22, is associated with an approximately 1000-fold reduction in computation time and near-linear scaling with time-series length, at the cost of an average reduction in classification accuracy of just 7%. catch22 captures a diverse and interpretable signature of time series in terms of their properties, including linear and non-linear autocorrelation, successive differences, value distributions and outliers, and fluctuation scaling properties. We provide an efficient implementation of catch22, accessible from many programming environments, that facilitates feature-based time-series analysis for scientific, industrial, financial and medical applications using a common language of interpretable time-series properties.
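The two selection criteria named in the abstract, strong performance across tasks and minimal redundancy, can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify: it is a simple greedy filter chosen for illustration. The function names, the `top_k` and `max_corr` parameters, and the accuracy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(accuracy, top_k, max_corr):
    """Greedy illustration (an assumption, not the published method).

    accuracy: dict mapping feature name -> list of per-task accuracies.
    Step 1: keep the top_k features by mean accuracy across tasks.
    Step 2: walk them from best to worst, keeping a feature only if its
            per-task accuracy profile correlates below max_corr with
            every feature already kept.
    """
    ranked = sorted(
        accuracy,
        key=lambda name: sum(accuracy[name]) / len(accuracy[name]),
        reverse=True,
    )
    selected = []
    for name in ranked[:top_k]:
        if all(pearson(accuracy[name], accuracy[kept]) < max_corr for kept in selected):
            selected.append(name)
    return selected


# Made-up per-task accuracies for four hypothetical features on four tasks.
accuracy = {
    "feat_a": [0.90, 0.60, 0.80, 0.74],
    "feat_b": [0.88, 0.61, 0.79, 0.72],  # performs almost identically to feat_a
    "feat_c": [0.60, 0.90, 0.65, 0.81],  # strong on the tasks where feat_a is weak
    "feat_d": [0.50, 0.50, 0.50, 0.50],  # chance-level everywhere
}

selected = select_features(accuracy, top_k=3, max_corr=0.9)
print(selected)
```

In this toy example the chance-level feature is removed by the performance filter, and the near-duplicate is removed by the redundancy filter, leaving two features with complementary strengths.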
Publisher
Cold Spring Harbor Laboratory