Abstract
Purpose
Time series are commonly clustered either with specialized distance measures or with standard clustering algorithms. Ensemble clustering is a common data mining technique for increasing clustering accuracy. This study develops a multistep approach to whole time series clustering based on segmentation, selection of the best segments, and ensemble clustering of the selected segments.
Design/methodology/approach
First, the approach divides the time series dataset into equal segments. Next, the best segments are selected using one or more internal clustering criteria, and the selected segments are then combined for the final clustering. Two algorithms, each with several settings, were developed; they differ in their use of a loop and in how the best segments are selected for the final clustering (with one criterion or with several criteria simultaneously). A logarithmic relationship limits the number of segments created in the loop.
Findings
The best setting of the two developed algorithms was first selected using the Rand index (an external criterion) and statistical tests. This setting was then compared with different algorithms from the literature in terms of clustering accuracy and execution time. The results indicate higher accuracy and shorter execution time for the proposed approach.
Originality/value
This paper proposes a fast and accurate approach to time series clustering in three main steps. It is the first work to combine segmentation with ensemble clustering. Higher accuracy and shorter execution time are the main achievements of this study.
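The three-step procedure described in the abstract above (segment, select, combine) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `segment_ensemble_clustering`, the `log2` cap on the number of segments, the use of k-means with the silhouette score as the single internal criterion, the `n_keep` parameter, and the co-association consensus step are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def segment_ensemble_clustering(X, n_clusters, n_keep=3, random_state=0):
    """Cluster equal-length time series (rows of X) via selected segments.

    Hypothetical sketch: base clusterer, criterion, and cap are assumptions.
    """
    X = np.asarray(X, dtype=float)
    length = X.shape[1]

    # Assumed logarithmic cap; the abstract only says the bound is logarithmic.
    max_segments = max(2, int(math.log2(length)))

    candidates = []  # (internal criterion value, labels) for each segment
    for n_segments in range(1, max_segments + 1):
        # Step 1: split every series into near-equal contiguous segments.
        for segment in np.array_split(X, n_segments, axis=1):
            labels = KMeans(
                n_clusters=n_clusters, n_init=10, random_state=random_state
            ).fit_predict(segment)
            if len(np.unique(labels)) < 2:
                continue  # the silhouette score needs at least two clusters
            candidates.append((silhouette_score(segment, labels), labels))

    # Step 2: keep the segments whose clusterings score best.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    best = [labels for _, labels in candidates[:n_keep]]

    # Step 3: combine the kept clusterings through a co-association matrix,
    # i.e. the fraction of kept clusterings that put two series together.
    coassoc = np.mean([np.equal.outer(lab, lab) for lab in best], axis=0)
    dist = 1.0 - coassoc
    np.fill_diagonal(dist, 0.0)

    # Average-linkage hierarchical clustering on the consensus distances.
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Given ground-truth labels, the output of such a sketch could be scored with an external index such as the Rand index (for example `sklearn.metrics.rand_score`), which is the kind of evaluation the abstract reports. A variant that selects segments on several internal criteria at once would replace the single silhouette ranking in step 2.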
Subject
Library and Information Sciences, Information Systems