Learning Representations for Incomplete Time Series Clustering

Author:

Ma Qianli, Chen Chuxin, Li Sen, Cottrell Garrison W.

Abstract

Time-series clustering is an essential unsupervised technique for data analysis, applied in many real-world fields such as medical analysis and DNA microarray analysis. Existing clustering methods usually assume that the data is complete. However, time series in real-world applications often contain missing values. The traditional strategy (imputing first and then clustering) does not optimize the imputation and clustering processes as a whole, which not only makes performance dependent on the particular combination of imputation and clustering methods but also fails to achieve satisfactory results. How best to improve clustering performance on incomplete time series remains a challenge. This paper proposes a novel unsupervised temporal representation learning model, named Clustering Representation Learning on Incomplete time-series data (CRLI). CRLI jointly optimizes the imputation and clustering processes so that it imputes more discriminative values for clustering and gives the learned representations good clustering properties. Also, to reduce error propagation from imputation to clustering, we introduce a discriminator that pushes the distribution of imputed values close to the true one, and we train CRLI in an alternating manner. An experiment conducted on eight real-world incomplete time-series datasets shows that CRLI outperforms existing methods. We demonstrate the effectiveness of the learned representations and the convergence of the model through visualization analysis. Moreover, we reveal that the joint training strategy can impute values close to the true ones in the important sub-sequences while imputing more discriminative values in the less important sub-sequences, making the imputed sequence cluster-friendly.
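For orientation, the two-stage baseline that the abstract contrasts CRLI against (impute first, then cluster) can be sketched as below. This is only a minimal illustration and not the CRLI model: per-series mean imputation and plain k-means are assumed stand-ins for "an imputation method" and "a clustering method", and the names `mean_impute`, `kmeans`, and `impute_then_cluster` are hypothetical. The two stages never exchange information, which is the decoupling the abstract criticizes.

```python
import math


def mean_impute(series):
    """Replace missing entries (None) with the mean of the observed values.

    Assumption: mean imputation is just one simple choice of imputer.
    """
    observed = [v for v in series if v is not None]
    fill = sum(observed) / len(observed) if observed else 0.0
    return [fill if v is None else v for v in series]


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on equal-length sequences.

    Assumption: the first k points are used as initial centroids so the
    sketch stays deterministic; practical code would use a better init.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def impute_then_cluster(dataset, k):
    """Two-stage pipeline: imputation is fixed before clustering starts."""
    completed = [mean_impute(s) for s in dataset]
    return kmeans(completed, k)


# Toy incomplete time series (None marks a missing value).
toy = [
    [1.0, None, 1.2, 0.9],
    [10.0, 10.5, None, 9.8],
    [0.8, 1.1, None, 1.0],
    [None, 10.2, 9.9, 10.1],
]
labels = impute_then_cluster(toy, k=2)
print(labels)
```

Because the imputed values are frozen before clustering begins, any imputation error is carried straight into the distance computations. The abstract's joint optimization is aimed at exactly this error propagation.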

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 13 articles.

1. Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects;IEEE Transactions on Pattern Analysis and Machine Intelligence;2024-10

2. TsCDD-GAN: A Conditional Dual-Discriminator Generative Adversarial Network for Incomplete Time Series Data Imputation and Clustering;2024 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL);2024-04-19

3. Prior knowledge-augmented unsupervised shapelet learning for unknown abnormal working condition discovery in industrial process;Advanced Engineering Informatics;2024-04

4. TCHA: Contrastive Learning via Hybrid Granularities and Adaptive Sampling for Multivariate Time Series;2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys);2023-12-17

5. ExpertNet: A Deep Learning Approach to Combined Risk Modeling and Subtyping in Intensive Care Units;IEEE Journal of Biomedical and Health Informatics;2023-10
