Water quality prediction: a data-driven approach exploiting advanced machine learning algorithms with data augmentation

Authors:

K Karthick¹, Krishnan S.², Manikandan R.³

Affiliations:

1. Department of Electrical and Electronics Engineering, GMR Institute of Technology, Rajam, Andhra Pradesh, India

2. Department of EEE, Mahendra Engineering College (Autonomous), Namakkal, Tamil Nadu, India

3. Department of ECE, Panimalar Engineering College, Chennai, India

Abstract

Water quality assessment plays a crucial role in various aspects, including human health, environmental impact, agricultural productivity, and industrial processes. Machine learning (ML) algorithms offer the ability to automate water quality evaluation and allow for effective and rapid assessment of parameters associated with water quality. This article proposes an ML-based classification model for water quality prediction. The model was tested with 14 ML algorithms and considers 20 features that represent various substances present in water samples and their concentrations. The dataset used in the study comprises 7,996 samples, and the model development involves several stages, including data preprocessing, Yeo–Johnson transformation for data normalization, principal component analysis (PCA) for feature selection, and the application of the synthetic minority over-sampling technique (SMOTE) to address class imbalance. Performance metrics, such as accuracy, precision, recall, and F1 score, are provided for each algorithm with and without SMOTE. LightGBM, XGBoost, CatBoost, and Random Forest were identified as the best-performing algorithms. XGBoost achieved the highest accuracy of 96.31% without SMOTE and had a precision of 0.933. The application of SMOTE enhanced the performance of CatBoost. These findings provide valuable insights for ML-based water quality assessment, aiding researchers and professionals in decision-making and management.
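The SMOTE step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not say which library or settings were used. The function names `smote` and `balance`, the neighbour count `k`, the toy data, and the binary-label setup are all assumptions made for illustration. The sketch uses only the Python standard library and shows the core idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length feature lists, all from one class.
    k: number of nearest minority neighbours considered per sample.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority sample, find the indices of its k nearest
    # minority neighbours (Euclidean distance, excluding itself).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))   # pick a minority sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # position along the segment
        x, z = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, z)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample one class until it matches the size of the rest.

    Assumes a binary problem (an illustrative simplification).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = (len(X) - len(minority)) - len(minority)
    extra = smote(minority, max(n_new, 0), k=k, seed=seed)
    return X + extra, y + [minority_label] * len(extra)


# Toy two-feature data, invented for this example: 3 minority, 5 majority.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
     [5.0, 5.0], [5.1, 4.9], [4.9, 5.2], [5.2, 5.1], [5.0, 4.8]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
X_bal, y_bal = balance(X, y, minority_label=1, k=2)
```

Because the new points are interpolations, they always fall within the range spanned by the existing minority samples. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics are computed on real samples. Maintained implementations exist, for example in the imbalanced-learn package.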

Publisher

IWA Publishing

Subject

Management, Monitoring, Policy and Law; Atmospheric Science; Water Science and Technology; Global and Planetary Change
