GeMSyD: Generic Framework for Synthetic Data Generation

Authors:

Tolas Ramona 1, Portase Raluca 1, Potolea Rodica 1

Affiliation:

1. Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania

Abstract

In the era of data-driven technologies, the need for diverse, high-quality datasets for training and testing machine learning models has become increasingly critical. In this article, we present a versatile methodology, the Generic Methodology for Constructing Synthetic Data Generation (GeMSyD), which addresses the challenge of synthetic data creation in the context of smart devices. GeMSyD provides a framework for generating synthetic datasets that align closely with real-world data. To demonstrate the utility of GeMSyD, we instantiate the methodology by constructing a synthetic data generation framework tailored to the domain of event-based data modeling, specifically focusing on user interactions with smart devices. Our framework leverages GeMSyD to create synthetic datasets that faithfully emulate the dynamics of human–device interactions, including their temporal dependencies. Furthermore, we showcase how the synthetic data generated using our framework can serve as a valuable resource for machine learning practitioners. Using these synthetic datasets, we perform a series of experiments to evaluate the performance of a neural-network-based prediction model in the domain of smart device interaction. Our results underscore the potential of synthetic data in facilitating model development and benchmarking.

Publisher

MDPI AG

Subject

Information Systems and Management, Computer Science Applications, Information Systems

