Prediction of Biogas Production Volumes from Household Organic Waste Based on Machine Learning

Authors:

Tryhuba Inna (1), Tryhuba Anatoliy (1, 2), Hutsol Taras (3), Cieszewska Agata (4), Andrushkiv Oleh (5), Glowacki Szymon (6), Bryś Andrzej (6), Slobodian Sergii (7), Tulej Weronika (6), Sojak Mariusz (6)

Affiliation:

1. Department of Information Technologies, Lviv National Environmental University, 80-381 Dublyany, Ukraine

2. Ukrainian University in Europe—Foundation, Balicka 116, 30-149 Krakow, Poland

3. Department of Mechanics and Agroecosystems Engineering, Polissia National University, 10-008 Zhytomyr, Ukraine

4. Department of Landscape Architecture, Warsaw University of Life Sciences, Nowoursynowska 159, 02-787 Warsaw, Poland

5. Department of Information Technologies, Lviv State University of Life Safety, 79-007 Lviv, Ukraine

6. Department of Fundamentals of Engineering and Power Engineering, Institute of Mechanical Engineering, Warsaw University of Life Sciences (SGGW), 02-787 Warsaw, Poland

7. Department of Information Technology, Physical, Mathematical and Civil Defence Disciplines, Faculty of Energy and Information Technologies, Higher Educational Institution “Podillia State University”, 32-300 Kamianets-Podilskyi, Ukraine

Abstract

This article proposes using machine learning, one area of artificial intelligence, to forecast the volume of biogas produced from household organic waste. Five regression algorithms (Linear Regression, Ridge Regression, Lasso Regression, Random Forest Regression, and Gradient Boosting Regression) are considered as candidates for an effective forecasting model. The algorithms are compared by mean squared error (MSE) and mean absolute error (MAE) to evaluate both the quality of training and the accuracy of forecasting. The proposed algorithm for creating the forecasting model involves 10 main and 3 auxiliary steps. Their advantage is that they support component data analysis, which is based on a method that reduces the dimensionality of the data set, increases interpretability, and minimizes the risk of data loss. A total of 2433 data points were analyzed, characterizing the formation of biogas from food waste (FW) and yard waste (YW) according to four features. Data preparation is performed in Python using the Jupyter Notebook environment. On the basis of the conducted research, the main advantages and disadvantages of the algorithms used for building forecasting models of biogas production volumes from household organic waste are determined. Two models, “Random Forest Regressor” and “Gradient Boosting Regressor”, show the best accuracy indicators. The other three models (Linear Regression, Ridge Regression, Lasso Regression) are inferior in accuracy and are not considered further. The MSE and MAE indicators are used to determine the accuracy of the “Random Forest Regressor” and “Gradient Boosting Regressor” models.
The “Random Forest Regressor” model is found to be more accurate than the “Gradient Boosting Regressor” model: on the training data set, its MSE is 7.14 times smaller and its MAE is 2.67 times smaller. The MSE and MAE of both models are worse on the test data set than on the training data set, which indicates a tendency toward overfitting. The “Gradient Boosting Regressor” model has worse MSE and MAE than the “Random Forest Regressor” model on both the training and test data sets. It is established that the model based on the “Random Forest Regressor” algorithm is the most effective for forecasting the volume of biogas production from household organic waste. It provides MAE = 0.088 on test data and the smallest absolute errors in predictions. Further systematic improvement of the “Random Forest Regressor” model based on new data is expected to preserve its accuracy and maintain its competitive advantages.
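
The comparison described in the abstract can be sketched in Python with scikit-learn. This is only an illustrative sketch, not the authors' code: the data below are synthetic stand-ins (2433 rows with four features, matching the sizes given above but not the real FW/YW measurements), and the function name `compare_regressors`, the 80/20 split, the random seed, and all hyperparameters are assumptions. The numbers it prints therefore will not reproduce the values reported in the abstract.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split


def compare_regressors(X, y, seed=0):
    """Fit five regressors and return train/test MSE and MAE for each.

    Hypothetical helper: the split ratio and hyperparameters are assumptions.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "Linear Regression": LinearRegression(),
        "Ridge Regression": Ridge(alpha=1.0),
        "Lasso Regression": Lasso(alpha=0.01),
        "Random Forest Regressor": RandomForestRegressor(random_state=seed),
        "Gradient Boosting Regressor": GradientBoostingRegressor(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred_train = model.predict(X_train)
        pred_test = model.predict(X_test)
        results[name] = {
            "mse_train": mean_squared_error(y_train, pred_train),
            "mae_train": mean_absolute_error(y_train, pred_train),
            "mse_test": mean_squared_error(y_test, pred_test),
            "mae_test": mean_absolute_error(y_test, pred_test),
        }
    return results


# Synthetic stand-in data (assumption): four features, nonlinear target plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2433, 4))
y = (
    np.sin(2.0 * np.pi * X[:, 0]) * X[:, 1]
    + X[:, 2] ** 2
    + rng.normal(0.0, 0.05, size=2433)
)

results = compare_regressors(X, y)
for name, metrics in results.items():
    print(
        f"{name:28s} "
        f"train MSE={metrics['mse_train']:.4f} MAE={metrics['mae_train']:.4f} | "
        f"test MSE={metrics['mse_test']:.4f} MAE={metrics['mae_test']:.4f}"
    )
```

Reporting the training and test errors side by side is what makes the overfitting check in the abstract possible: a model whose test MSE and MAE are clearly worse than its training values is fitting the training set more closely than it generalizes.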

Funder

Science development fund of the Warsaw University of Life Sciences–SGGW

Publisher

MDPI AG
