Abstract
Natural gas (NG), which consists mainly of methane, causes significant air pollution and environmental and health problems when it is released into the air. Machine learning methods for predicting gas losses are therefore increasingly needed. In this article, we propose predicting NG leakage levels through feature selection based on factorial analysis (FA) of open data on urban natural gas in the USA. The proposed method consists of three modules. First, we select essential features using FA. Then, the dataset is labeled by k-means clustering with OrdinalEncoder (OE)-based normalization. The final module uses six algorithms (extreme gradient boosting (XGBoost), K-nearest neighbors (KNN), decision tree (DT), random forest (RF), Naive Bayes (NB), and multilayer perceptron (MLP)) to predict gas leakage levels. The proposed method is evaluated by accuracy, F1-score, mean squared error (MSE), and area under the ROC curve (AUC). The test results indicate that the F-OE-based approach improves classification performance. F-OE-based XGBoost (F-OE-XGBoost) showed the best performance, with 95.14% accuracy, an F1-score of 95.75%, an MSE of 0.028, and an AUC of 96.29%. The F-OE-RF model achieved the second-best results, with an accuracy of 95.09%, an F1-score of 95.60%, an MSE of 0.029, and an AUC of 96.11%.
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
7 articles.