Clustering-based adaptive data augmentation for class-imbalance in machine learning (CADA): additive manufacturing use case

Published: 2022-05-23
ISSN: 0941-0643
Container-title: Neural Computing and Applications
Language: en
Short-container-title: Neural Comput & Applic

Author:

Dasari Siva Krishna, Cheddad Abbas, Palmquist Jonatan, Lundberg Lars

Abstract

Large amounts of data are generated from in-situ monitoring of additive manufacturing (AM) processes, and these data are later used in prediction modelling for defect classification to speed up quality inspection of products. A high volume of this process data is defect-free (majority class) and a lower volume has defects (minority class), which results in the class-imbalance issue. Using imbalanced datasets, classifiers often provide sub-optimal classification results, i.e. better performance on the majority class than on the minority class. However, it is important for process engineers that models classify defects more accurately than the class with no defects, since this is crucial for quality inspection. Hence, we address the class-imbalance issue in manufacturing process data to support in-situ quality control of additively manufactured components. For this, we propose cluster-based adaptive data augmentation (CADA) for oversampling to address the class-imbalance problem. Quantitative experiments are conducted to evaluate the performance of the proposed method and to compare it with other selected oversampling methods using AM datasets from the aerospace industry and a publicly available casting manufacturing dataset. The results show that CADA outperformed random oversampling and the SMOTE method and performed similarly to random data augmentation and cluster-based oversampling. Furthermore, the results of the statistical significance test show that there is a significant difference between the studied methods. As such, the CADA method can be considered an alternative oversampling method to improve the performance of models on the minority class.
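
For orientation, random oversampling, one of the baselines named in the abstract, can be sketched as follows. This is a minimal illustration only, not the CADA method, whose details this record does not give. The function name `random_oversample` and the toy data are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples = list(samples)
    out_labels = list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy data (assumed): 6 defect-free samples (label 0), 2 defective (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have the same count
```

Because this baseline only copies existing minority samples, it adds no new variation to the minority class. Augmentation-based methods such as the one proposed in the abstract aim to address that limitation.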

Funder

Blekinge Institute of Technology

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Software

Link

https://link.springer.com/content/pdf/10.1007/s00521-022-07347-6.pdf

References (34 articles)

1. Abouelenien M, Yuan X, Giritharan B, Liu J, Tang S (2013) Cluster-based sampling and ensemble for bleeding detection in capsule endoscopy videos. Am J Sci Eng 2(1):24–32

2. Bach M, Werner A, Żywiec J, Pluskiewicz W (2017) The study of under-and over-sampling methods’ utility in analysis of highly imbalanced data on osteoporosis. Inf Sci 384:174–190

3. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

4. Caggiano A, Zhang J, Alfieri V, Caiazzo F, Gao R, Teti R (2019) Machine learning-based image processing for on-line defect recognition in additive manufacturing. CIRP Ann 68(1):451–454

5. Cateni S, Colla V, Vannucci M (2014) A method for resampling imbalanced datasets in binary classification tasks for real-world problems. Neurocomputing 135:32–41

Cited by 7 articles.

1. Neural network prediction of thermal field spatiotemporal evolution during additive manufacturing: an overview;The International Journal of Advanced Manufacturing Technology;2024-08-20

2. Leveraging small-scale datasets for additive manufacturing process modeling and part certification: Current practice and remaining gaps;Journal of Manufacturing Systems;2024-08

3. Prediction of dementia based on older adults’ sleep disturbances using machine learning;Computers in Biology and Medicine;2024-03

4. Systematic review of class imbalance problems in manufacturing;Journal of Manufacturing Systems;2023-12

5. Application of Machine Learning to Monitor Metal Powder-Bed Fusion Additive Manufacturing Processes;Additive Manufacturing Design and Applications;2023-06-30