Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

Published: 2021-06-14 Volume: 4
ISSN: 2624-8212
Container-title: Frontiers in Artificial Intelligence
Short-container-title: Front. Artif. Intell.

Author:

Tripathi Shailesh, Muhr David, Brunner Manuel, Jodlbauer Herbert, Dehmer Matthias, Emmert-Streib Frank

Abstract

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely accepted framework in production and manufacturing. This data-driven knowledge discovery framework provides an orderly partition of the often complex data mining processes to ensure a practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple issues related to data and model development. These issues need to be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework. For this reason, extensions of CRISP-DM are needed. In this paper, we provide a detailed review of CRISP-DM and summarize extensions of this model into a novel framework we call Generalized Cross-Industry Standard Process for Data Science (GCRISP-DS). This framework is designed to allow dynamic interactions between different phases to adequately address data- and model-related issues for achieving robustness. Furthermore, it also emphasizes the need for a detailed business understanding and the interdependencies with the developed models and data quality for fulfilling higher business objectives. Overall, such a customizable GCRISP-DS framework provides an enhancement for model improvements and reusability by minimizing robustness issues.

Publisher

Frontiers Media SA

References: 143 articles.

1. Transposable regularized covariance models with an application to missing data imputation;Allen;Ann. Appl. Stat.,2010

2. Power to the people: the role of humans in interactive machine learning;Amershi;AI Magazine,2014

3. Big data visualization and analytics: future research challenges and emerging applications;Andrienko G., Andrienko N., Drucker S., Fekete J-D., Fisher D., Idreos S.;2020

4. Context-aware data quality assessment for big data;Ardagna;Future Generation Comput. Syst.,2018

5. A survey on unsupervised outlier detection in high-dimensional numerical data;Arthur;Stat. Anal. Data Mining,2012

Cited by 25 articles.

1. Reliability-improved machine learning model using knowledge-embedded learning approach for smart manufacturing;Journal of Intelligent Manufacturing;2024-09-09

2. Quality of care assessment for non-small cell lung cancer patients: transforming routine care data into a continuous improvement system;Clinical and Translational Oncology;2024-08-16

3. Information exchange and knowledge discovery for additive manufacturing digital thread: a comprehensive literature review;International Journal of Computer Integrated Manufacturing;2024-08-14

4. Analyzing Enrolment Patterns: Modified Stacked Ensemble Statistical Learning-Based Approach to Educational Decision-Making;Akademika;2024-07-31

5. Opportunities in Neural Networks for Industry 4.0;Topics in Artificial Intelligence Applied to Industry 4.0;2024-04-05