Experience: A Comparative Analysis of Multivariate Time-Series Generative Models: A Case Study on Human Activity Data

Published: 2024-08-20
ISSN: 1936-1955
Container-title: Journal of Data and Information Quality
Language: en
Short-container-title: J. Data and Information Quality

Author:

Alzahrani Naif¹, Cała Jacek², Missier Paolo¹

Affiliation:

1. School of Computing, Newcastle University, Newcastle upon Tyne, United Kingdom of Great Britain and Northern Ireland

2. National Innovation Centre for Data at Newcastle University, Newcastle upon Tyne, United Kingdom of Great Britain and Northern Ireland

Abstract

Human activity recognition (HAR) is an active research field that has seen great success in recent years due to advances in sensory data collection methods and activity recognition systems. Deep artificial intelligence (AI) models have recently contributed to the success of HAR systems, although they still suffer from limitations such as data scarcity, the high cost of labelling data instances, and dataset imbalance and bias. The temporal nature of human activity data, represented as time-series data, imposes an additional challenge to using AI models in HAR because most state-of-the-art models do not account for the time component of the data instances. These limitations have inspired the time-series research community to design generative models for sequential data, but very little work has been done to evaluate the quality of such models. In this work, we conduct a comparative quality analysis of three generative models for time-series data, using a case study in which we aim to generate sensory human activity data from a seed public dataset. Additionally, we adapt and clearly explain four evaluation methods for synthetic time-series data from the literature and apply them to assess the quality of the synthetic activity data we generate. We show experimentally that high-quality human activity data can be generated using deep generative models, and that the synthetic data can thus be used in HAR systems to augment real activity data. We also demonstrate that the chosen evaluation methods effectively ensure that the generated data meets the essential quality benchmarks of realism, diversity, coherence and utility. Our findings suggest that using deep generative models to produce synthetic human activity data can potentially address challenges related to data scarcity, biases, and expensive labelling. This holds promise for enhancing the efficiency and reliability of HAR systems.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3688393
