Affiliation:
1. School of Computing Technologies, RMIT, Melbourne, VIC 3000, Australia
2. Coles, Melbourne, VIC 3123, Australia
3. Department of Computer Science, University of North Carolina, Charlotte, NC 28223, USA
Abstract
Medical time series are sequential data collected over time that measure health-related signals, such as electroencephalography (EEG), electrocardiography (ECG), and intensive care unit (ICU) readings. Analyzing medical time series to identify latent patterns and trends can yield highly valuable insights for enhancing diagnosis, treatment, risk assessment, and the understanding of disease progression. However, data mining in medical time series is heavily limited by sample annotation, which is time-consuming, labor-intensive, and dependent on experts. Self-supervised contrastive learning, which has shown great success since 2020, is a promising way to mitigate this challenge. Contrastive learning aims to learn representative embeddings by contrasting positive and negative samples without requiring explicit labels. Here, we conducted a systematic review, following PRISMA standards, of how contrastive learning alleviates label scarcity in medical time series. We searched five scientific databases (IEEE, ACM, Scopus, Google Scholar, and PubMed) and retrieved 1908 papers based on the inclusion criteria. After applying the exclusion criteria and screening at the title, abstract, and full-text levels, we carefully reviewed 43 papers in this area. Specifically, this paper outlines the pipeline of contrastive learning, including pre-training, fine-tuning, and testing. We provide a comprehensive summary of the various augmentations applied to medical time series data, the architectures of pre-training encoders, the types of fine-tuning classifiers and clusters, and the popular contrastive loss functions. Moreover, we present an overview of the different data types used in medical time series, highlight the medical applications of interest, and provide a comprehensive table of 51 public datasets that have been utilized in this field.
In addition, this paper discusses promising future directions, such as providing guidance for effective augmentation design, developing a unified framework for analyzing hierarchical time series, and investigating methods for processing multimodal data. Despite being in its early stages, self-supervised contrastive learning has shown great potential in overcoming the need for expert-created annotations in medical time series research.
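The core idea described in the abstract, contrasting a positive pair against negative samples without labels, can be illustrated with a minimal sketch. This is not the method of any reviewed paper: the function names (`jitter`, `cosine`, `info_nce`), the jitter augmentation, the temperature value, and the toy signals are all assumptions chosen for illustration, and the raw series stand in for the embeddings that a pre-trained encoder would normally produce.

```python
import math
import random


def jitter(series, sigma=0.05, rng=None):
    """Hypothetical augmentation: add small Gaussian noise to each time step."""
    rng = rng or random.Random(0)
    return [x + rng.gauss(0.0, sigma) for x in series]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor: negative log-softmax of the positive."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy data (assumed): a sine wave as the anchor "signal", a jittered copy as
# its positive view, and unrelated Gaussian noise series as negatives.
rng = random.Random(42)
anchor = [math.sin(0.3 * t) for t in range(50)]
positive = jitter(anchor, rng=rng)
negatives = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(4)]

# Loss when the true augmented view is treated as the positive ...
loss_aligned = info_nce(anchor, positive, negatives)
# ... versus when an unrelated series is wrongly treated as the positive.
loss_mismatch = info_nce(anchor, negatives[0], [positive] + negatives[1:])
print(loss_aligned, loss_mismatch)
```

In this toy setting the loss is lower when the augmented view of the same signal is used as the positive, which is the signal a pre-training encoder would be optimized on before fine-tuning with a small labeled set.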
Funder
National Science Foundation
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Reference
138 articles.
Cited by
14 articles.