Effective Outlier Detection for Ensuring Data Quality in Flotation Data Modelling Using Machine Learning (ML) Algorithms

Published: 2024-09-10, Issue: 9, Volume: 14, Page: 925
ISSN:2075-163X
Container-title:Minerals
language:en
Short-container-title:Minerals

Author:

Lartey Clement¹², Liu Jixue², Asamoah Richmond K.¹, Greet Christopher³, Zanin Massimiliano¹⁴, Skinner William¹

Affiliation:

1. Future Industries Institute, University of South Australia, Adelaide, SA 5095, Australia

2. UniSA STEM, University of South Australia, Adelaide, SA 5095, Australia

3. Magotteaux Australia Pty. Ltd., Wingfield, Adelaide, SA 5013, Australia

4. School of Chemical Engineering, The University of Adelaide, Adelaide, SA 5005, Australia

Abstract

Froth flotation, a widely used mineral beneficiation technique, generates substantial volumes of data that can yield valuable insights for production line analysis. The quality of flotation data is critical to building accurate prediction models and to process optimisation. Unfortunately, industrial flotation data are often compromised by quality issues such as outliers, which can produce misleading or erroneous analytical results. A common approach is to preprocess the data by replacing or imputing outliers with values that have no connection with the real state of the process. However, this does not resolve the effect of outliers, especially those that deviate from normal trends. Outliers often occur across multiple variables, and their values may fall within normal observation ranges, which makes them difficult to detect. An unresolved challenge in outlier detection is determining how far an observation must lie from the rest of the data to be considered an outlier. Existing methods rely on domain experts’ knowledge, which is difficult to apply when experts face large volumes of data with complex relationships. In this paper, we propose an approach to outlier analysis on a flotation dataset and examine the efficacy of multiple machine learning (ML) algorithms, including k-Nearest Neighbour (kNN), Local Outlier Factor (LOF), and Isolation Forest (ISF), relative to the statistical 2σ rule for identifying outliers. We introduce the concept of “quasi-outliers”, determined by the 2σ threshold, as a benchmark for assessing the ML algorithms’ performance. The study also analyses the mutual coverage between quasi-outliers and the outliers found by the ML algorithms to identify the most effective outlier detection algorithm. We found that the outliers detected by kNN cover the outliers detected by the other methods. The experimental results show that outliers affect model prediction accuracy and that excluding outliers from training data can reduce the average prediction errors.
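
The ideas named in the abstract (quasi-outliers from the 2σ rule, a kNN-based detector, and the coverage between the two sets) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sample readings are invented, the function names are hypothetical, and the kNN score used here (distance to the k-th nearest neighbour, flagging the `top_n` largest) is one common variant assumed for illustration only.

```python
import math
import statistics

# Invented two-variable readings; row 9 is deliberately far from the rest.
readings = [
    (1.0, 10.0), (1.1, 10.2), (0.9, 9.8), (1.2, 10.1), (0.8, 9.9),
    (1.0, 10.3), (1.1, 9.7), (0.9, 10.0), (1.0, 10.1), (5.0, 30.0),
]

def quasi_outliers(rows, n_sigma=2.0):
    """Indices of rows with any variable more than n_sigma std devs from its mean."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    flagged = set()
    for i, row in enumerate(rows):
        for x, m, s in zip(row, means, stds):
            if s > 0 and abs(x - m) > n_sigma * s:
                flagged.add(i)
                break
    return flagged

def knn_outliers(rows, k=3, top_n=1):
    """Indices of the top_n rows with the largest distance to their k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(rows):
        dists = sorted(math.dist(p, q) for j, q in enumerate(rows) if j != i)
        scores.append((dists[k - 1], i))
    scores.sort(reverse=True)
    return {i for _, i in scores[:top_n]}

def coverage(detected, benchmark):
    """Fraction of the benchmark set that also appears in the detected set."""
    return len(detected & benchmark) / len(benchmark) if benchmark else 1.0

quasi = quasi_outliers(readings)
knn = knn_outliers(readings, k=3, top_n=1)
print(sorted(quasi))         # [9]
print(sorted(knn))           # [9]
print(coverage(knn, quasi))  # 1.0
```

On this toy data both detectors flag the same row, so the kNN set fully covers the quasi-outliers. On real flotation data the two sets would generally differ, which is the situation the coverage measure is meant to quantify.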

Funder

Australian Research Council Integrated Operations for Complex Resources Industrial Transformation Training Centre

universities, industry and the Australian Government

Publisher

MDPI AG

Link

https://www.mdpi.com/2075-163X/14/9/925/pdf
