Abstract
Background: Internet of Things (IoT) edge analytics makes data computation and storage available close to where data is generated in the IoT system. This approach improves sensor data handling and speeds up analysis, prediction, and action. Using machine learning for analytics and task offloading on edge servers could minimise latency and energy usage. However, one of the key challenges in applying machine learning to edge analytics is finding a real-world dataset on which to build a more representative predictive model. This challenge has slowed the adoption of machine learning methods in IoT edge analytics. Generating realistic synthetic datasets can therefore help speed up the methodological use of machine learning in edge analytics.

Methods: We create synthetic data with features that resemble data from IoT devices. We use an existing air quality dataset that includes temperature and gas sensor measurements. This real-time dataset includes component values for the Air Quality Index (AQI) and concentrations, in parts per million (ppm), of various polluting gases. We build a JavaScript Object Notation (JSON) model that captures the distribution of variables and the structure of this real dataset, and use it to generate the synthetic data. We then create a comparative predictive model based on the synthetic and original datasets.

Results: Analysis of the predictive model built on the synthetic dataset shows that the synthetic data can be used successfully for edge analytics in place of real-world datasets. There is no significant difference between the real-world dataset and the synthetic dataset. The generated synthetic data requires no modification to suit edge computing requirements.

Conclusions: The framework can generate representative synthetic datasets based on JSON schema attributes. The accuracy, precision, and recall values for the real and synthetic datasets indicate that the logistic regression model can successfully classify the data.
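As a rough illustration of the Methods step, the sketch below shows one way a JSON description of per-variable distributions could drive synthetic record generation, together with the accuracy, precision, and recall measures named in the Conclusions. It is a minimal sketch under stated assumptions, not the authors' framework: the schema layout, the field names (temperature_c, co_ppm, aqi_category), the distribution parameters, and the helper functions are all hypothetical and chosen only for illustration.

```python
import json
import random

# Hypothetical schema: field names, distribution types, and parameters are
# illustrative assumptions, not values taken from the paper or its dataset.
SCHEMA_JSON = """
{
  "fields": {
    "temperature_c": {"dist": "normal", "mean": 28.0, "std": 3.0},
    "co_ppm": {"dist": "uniform", "min": 0.0, "max": 10.0},
    "aqi_category": {
      "dist": "categorical",
      "values": ["good", "moderate", "unhealthy"],
      "weights": [0.5, 0.3, 0.2]
    }
  }
}
"""


def sample_field(spec, rng):
    """Draw one value for a field according to its distribution spec."""
    kind = spec["dist"]
    if kind == "normal":
        return rng.gauss(spec["mean"], spec["std"])
    if kind == "uniform":
        return rng.uniform(spec["min"], spec["max"])
    if kind == "categorical":
        return rng.choices(spec["values"], weights=spec["weights"], k=1)[0]
    raise ValueError(f"unsupported distribution: {kind}")


def generate(schema, n, seed=0):
    """Generate n synthetic records; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [
        {name: sample_field(spec, rng) for name, spec in schema["fields"].items()}
        for _ in range(n)
    ]


def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


schema = json.loads(SCHEMA_JSON)
synthetic_rows = generate(schema, n=5, seed=42)
```

In practice the distribution parameters would be estimated from the real dataset rather than written by hand, and the same metrics would be computed for a classifier trained on the real data and on the synthetic data so the two can be compared.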
Funder
Ministry of Higher Education (MOHE) Fundamental Research Grant Scheme
Subject
General Pharmacology, Toxicology and Pharmaceutics; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology; General Medicine
Cited by
2 articles.