Affiliation:
1. Institute of Engineering and Technology, Lucknow 226021, India
Abstract
To obtain high performance, generalization, and accuracy in machine learning applications such as prediction or anomaly detection, large datasets are a necessary prerequisite. However, collecting data is time-consuming, difficult, and expensive, and many available datasets are small or imbalanced. These challenges are evident in financial and banking services, pharmaceuticals and healthcare, manufacturing and the automobile industry, robotic cars, sensor time-series data, and many other domains. To overcome the challenges of data collection, researchers in many domains are increasingly interested in generating synthetic data. Generating synthetic time-series data is far more complicated and expensive than generating synthetic tabular data. The primary objective of the paper is to generate synthetic multivariate time-series data (for continuous and mixed parameters) that are comparable to, and evaluated against, real multivariate time-series data. A novel GAN architecture named MTS-TGAN is proposed, trained to produce such data, and then assessed using both qualitative measures, namely t-SNE, PCA, and discriminative and predictive scores, and quantitative measures, for which an RNN model is implemented that calculates MAE and MSLE scores for three training phases: Train Real Test Real, Train Real Test Synthetic, and Train Synthetic Test Real. The model is able to reduce the overall error by up to 13% and 10% in predictive and discriminative scores, respectively. The research's objectives are met, and the outcomes demonstrate that MTS-TGAN is able to capture the distribution and underlying knowledge contained in the attributes of the real data, so it can serve as a starting point for additional research in this area.
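The three evaluation phases named in the abstract (Train Real Test Real, Train Real Test Synthetic, Train Synthetic Test Real) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract uses an RNN as the predictor, whereas this sketch swaps in a plain least-squares linear model to stay short and dependency-light. The function names (`mae`, `msle`, `evaluate_regimes`) and the `(X, y)` data layout are assumptions made for illustration, and MSLE is computed with the common `log(1 + y)` form, which assumes non-negative values such as data scaled to [0, 1].

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def msle(y_true, y_pred):
    """Mean squared logarithmic error, using log(1 + y); inputs must be > -1."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2))


def _with_bias(X):
    # Append a column of ones so the linear model has an intercept.
    return np.hstack([X, np.ones((len(X), 1))])


def fit_linear(X, y):
    # Stand-in predictor (assumption): ordinary least squares, not an RNN.
    weights, *_ = np.linalg.lstsq(_with_bias(X), y, rcond=None)
    return weights


def predict_linear(weights, X):
    return _with_bias(X) @ weights


def evaluate_regimes(real_train, real_test, synthetic):
    """Score a predictor under TRTR, TRTS, and TSTR.

    Each argument is an (X, y) pair of NumPy arrays.
    """
    regimes = {
        "TRTR": (real_train, real_test),   # Train Real, Test Real
        "TRTS": (real_train, synthetic),   # Train Real, Test Synthetic
        "TSTR": (synthetic, real_test),    # Train Synthetic, Test Real
    }
    results = {}
    for name, ((X_tr, y_tr), (X_te, y_te)) in regimes.items():
        weights = fit_linear(X_tr, y_tr)
        # Clip predictions at zero so log1p stays well defined.
        pred = np.clip(predict_linear(weights, X_te), 0.0, None)
        results[name] = {"MAE": mae(y_te, pred), "MSLE": msle(y_te, pred)}
    return results
```

Under this setup, synthetic data that captures the real distribution should give TSTR and TRTS errors close to the TRTR baseline; a large gap suggests the generator has missed structure that the predictor relies on.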
Funder
Department of Science & Technology
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
9 articles.