Leakage Prediction in Machine Learning Models When Using Data from Sports Wearable Sensors

Published: 2022-05-17 Volume: 2022 Page: 1-9
ISSN: 1687-5273
Container-title: Computational Intelligence and Neuroscience
Language: en

Author:

Dong Qizheng¹

Affiliation:

1. Zhengzhou University of Science and Technology, Zhengzhou, Henan 450000, China

Abstract

One of the major problems in machine learning is data leakage, which can be directly related to adversarial-type attacks and raises serious concerns about the validity and reliability of artificial intelligence. Data leakage occurs when the independent variables used to train a machine learning algorithm include either the dependent variable itself or a variable that clearly carries information about what the model is trying to predict. Leakage leads to unreliable and poor predictive results once the model has been developed and put to use. It prevents the model from generalizing, which is essential in any machine learning problem, and thus leads to false assumptions about its performance. To obtain a robust forecasting model that generalizes well, great attention must be paid to detecting and preventing data leakage. This study presents an innovative system for leakage prediction in machine learning models. It is based on Bayesian inference and computes the posterior (inverse) probability of unobserved variables in order to draw statistical conclusions about the relevant correlated variables, and accordingly computes a lower bound on the marginal likelihood of the observed variables being derived from some coupling method. The main notion is that a higher marginal likelihood for a set of variables suggests a better fit to the data and thus a greater likelihood of data leakage in the model. The methodology is evaluated on a specialized dataset derived from sports wearable sensors.
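
The definition of leakage given in the abstract can be illustrated with a small synthetic sketch. This is not the Bayesian method of the paper; it only shows, under assumed data, how a feature that nearly copies the target inflates the held-out score and then fails when that feature is unavailable at prediction time. All names (`make_data`, `fit_linear`, `r2`, the noise levels) are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Three legitimate features and a noisy target (synthetic, assumed setup).
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=2.0, size=n)
    return X, y

def fit_linear(X, y):
    # Ordinary least squares with an intercept column appended.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2(coef, X, y):
    # Coefficient of determination on the given data.
    A = np.column_stack([X, np.ones(len(X))])
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

X_train, y_train = make_data(2000)
X_test, y_test = make_data(2000)

# Leaky feature: the dependent variable itself plus a little noise.
leak_train = y_train + rng.normal(scale=0.1, size=len(y_train))
leak_test = y_test + rng.normal(scale=0.1, size=len(y_test))

honest = fit_linear(X_train, y_train)
leaky = fit_linear(np.column_stack([X_train, leak_train]), y_train)

r2_honest = r2(honest, X_test, y_test)
r2_leaky = r2(leaky, np.column_stack([X_test, leak_test]), y_test)

# At deployment the target is unknown, so the leaky column cannot be filled in;
# here it is replaced by zeros to mimic that situation.
X_deploy = np.column_stack([X_test, np.zeros(len(X_test))])
r2_deploy = r2(leaky, X_deploy, y_test)

print(f"honest model, held-out R^2:        {r2_honest:.3f}")
print(f"leaky model, held-out R^2:         {r2_leaky:.3f}")
print(f"leaky model, leak unavailable R^2: {r2_deploy:.3f}")
```

The leaky model looks almost perfect on held-out data because the leak is present there too, which is the false assumption about performance that the abstract warns about; once the leaked column is missing, it does worse than the model trained without it.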

Publisher

Hindawi Limited

Subject

General Mathematics, General Medicine, General Neuroscience, General Computer Science

Link

http://downloads.hindawi.com/journals/cin/2022/5314671.pdf

Reference: 54 articles.

1. A dynamic ensemble learning algorithm for neural networks

2. A Survey of Uncertainty in Deep Neural Networks; J. Gawlikowski; 2021

3. A Lipschitz - Shapley Explainable Defense Methodology Against Adversarial Attacks

4. Polymorphic Adversarial DDoS attack on IDS using GAN

5. Adversarial attack on DL-based massive MIMO CSI feedback

Cited by 14 articles.

1. Exposing Data Leakage in Wi-Fi CSI-Based Human Action Recognition: A Critical Analysis; Inventions; 2024-08-15

2. Towards robust and efficient intrusion detection in IoMT: a deep learning approach addressing data leakage and enhancing model generalizability; Multimedia Tools and Applications; 2024-07-30

3. Implications of Data Leakage in Machine Learning Preprocessing: A Multi-Domain Investigation; 2024-07-12

4. Predicting Response to Neuromodulators or Prokinetics in Patients With Suspected Gastroparesis Using Machine Learning: The “BMI, Infectious Prodrome, Delayed GES, and No Diabetes” Model; Clinical and Translational Gastroenterology; 2024-07-01

5. An empirical assessment of ML models for 5G network intrusion detection: A data leakage-free approach; e-Prime - Advances in Electrical Engineering, Electronics and Energy; 2024-06