Training Data Selection by Categorical Variables for Better Rare Event Prediction in Multiple Products Production Line

Published: 2022-03-28 Volume: 11 Issue: 7 Page: 1056
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics

Author:

Xu Dongting, Zhang Zhisheng, Shi Jinfei

Abstract

Manufacturers struggle to use data from multiple-product production lines to predict rare events. Improving the quality of training data is a common way to improve algorithm performance, yet there is little research on how to select training data quantitatively. This study proposes a training data selection method to improve the performance of deep learning models. The method represents multivariate time series of different lengths, split by categorical variables, and measures their (dis)similarities with a distance matrix and a clustering method. The contributions are: (1) the proposed method can find the changes to the training data caused by categorical variables in a multivariate time series dataset; (2) with the proposed method, the multivariate time series data from the production line can be clustered into many small training datasets; and (3) in contrast to the traditional single-model approach, several prediction models with the same structure but different parameters are built. The method is applied to a real dataset from a multiple-product production line, and the results show that it not only significantly improves the performance of the reconstruction model but also quantitatively measures the (dis)similarities of the production behaviors.
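The abstract does not specify the representation, distance, or clustering algorithm used. As a rough illustration only of the general idea (split a multivariate time series by a categorical variable, map each variable-length segment to a fixed-length vector, build a distance matrix, and group similar segments), the sketch below assumes per-channel mean and standard deviation as the representation, Euclidean distance, and a simple threshold-based single-linkage grouping. All function names, the toy data, and the threshold are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import dist
from statistics import fmean, pstdev


def split_by_category(rows, labels):
    """Group time-ordered rows (one list of sensor values each) by category label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return dict(groups)


def summarize(segment):
    """Assumed representation: per-channel mean and population std.

    This maps a segment of any length to a vector of fixed length.
    """
    channels = list(zip(*segment))
    return [fmean(c) for c in channels] + [pstdev(c) for c in channels]


def distance_matrix(vectors):
    """Pairwise Euclidean distances between the summary vectors."""
    return [[dist(a, b) for b in vectors] for a in vectors]


def cluster(names, dmat, threshold):
    """Single-linkage grouping: merge items whose distance is below threshold."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if dmat[i][j] < threshold:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, name in enumerate(names):
        clusters[find(i)].append(name)
    return sorted(sorted(c) for c in clusters.values())


# Toy data: two sensor channels, three product types with segments of
# different lengths (hypothetical values).
rows = [[1.0, 10.0], [1.2, 10.5], [5.0, 20.0], [5.1, 19.5],
        [4.9, 20.5], [1.1, 10.2], [0.9, 9.8]]
labels = ["A", "A", "B", "B", "B", "C", "C"]

groups = split_by_category(rows, labels)
names = sorted(groups)
dmat = distance_matrix([summarize(groups[n]) for n in names])
result = cluster(names, dmat, threshold=1.0)
print(result)  # [['A', 'C'], ['B']]
```

In this toy run, products A and C behave alike and end up in one training subset, while B forms its own. Under the abstract's third contribution, each such subset would then train its own copy of the same model architecture.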

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/11/7/1056/pdf

References (20 articles)

1. The Way We Train AI Is Fundamentally Flawed; Heaven, 2020

2. If Your Data Is Bad, Your Machine Learning Tools Are Useless; Redman, 2018

3. Understanding Deep Learning and Applications on Rare Event Prediction; Ranjan, 2020

4. Early failure detection of paper manufacturing machinery using nearest neighbor‐based feature extraction

5. SMOTE: Synthetic Minority Over-sampling Technique

Cited by 3 articles.

1. MMA: metadata supported multi-variate attention for onset detection and prediction; Data Mining and Knowledge Discovery; 2024-02-19

2. A Data Quality Assessment and Control Method in Multiple Products Manufacturing Process; 2022 5th International Conference on Data Science and Information Technology (DSIT); 2022-07-22

3. A New Multi-Sensor Stream Data Augmentation Method for Imbalanced Learning in Complex Manufacturing Process; Sensors; 2022-05-26