Advanced EOR screening methodology based on LightGBM and random forest: A classification problem with imbalanced data

Author:

Seyyedattar Masoud (1), Afshar Majid (2), Zendehboudi Sohrab (1), Butt Stephen (1)

Affiliation:

1. Faculty of Engineering and Applied Science, Memorial University, St. John's, Newfoundland, Canada

2. School of Computer Science, University of Windsor, Windsor, Ontario, Canada

Abstract

In an unstable oil market with volatile prices due to various natural and geopolitical factors, it is crucial for oil‐producing companies to enhance the value of their assets by improving the recovery factors of petroleum reservoirs. Primary recovery through natural depletion or artificial lift and secondary recovery using waterflooding and immiscible gas injection typically recover no more than 10%–40% of the available reserves. A significant portion of the hydrocarbons remains unproduced if enhanced oil recovery (EOR) methods are not implemented. EOR projects are extremely costly and complex, and usually have long lead times from the decision‐making and design phases to pilot and full‐field implementations. Therefore, oil and gas operators need reliable insights into the best possible EOR options from the early stages of any field development planning. Since screening potential EOR choices is the first step in deciding future production scenarios, a smart EOR screening tool can add significant value by streamlining the EOR decision‐making process. In this study, we developed an EOR screening tool based on two advanced machine learning classification algorithms, random forest and light gradient boosting machine (LightGBM). These tree‐based ensemble learning classifiers were trained on an extensive dataset of 1384 worldwide EOR implementations, with reservoir conditions and reservoir rock and fluid properties as the feature space and the EOR type as the class label. When EOR screening is treated as a classification problem, an essential aspect of model development is addressing the class imbalance of EOR datasets. To tackle this issue, the adaptive synthetic (ADASYN) sampling method was used to reduce classification bias by oversampling the training sets to achieve uniform class distributions.
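The oversampling step can be illustrated with a minimal sketch of the ADASYN idea. The abstract does not say which implementation was used, so everything here is an assumption for illustration: the function name `adasyn_binary`, the parameters `k`, `beta` and `seed`, the restriction to a two-class problem (the study itself is multi-class), and the choice to draw the interpolation partner from the k nearest minority neighbours. It uses only the Python standard library.

```python
import math
import random


def adasyn_binary(X, y, minority, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples for a two-class problem.

    Hypothetical helper, not the authors' code. X is a list of numeric
    feature vectors and y the matching list of class labels.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    # Number of synthetic samples to create; beta=1.0 targets balance.
    G = round((n_maj - n_min) * beta)
    if G <= 0 or n_min < 2:
        return []

    def knn(i, candidates):
        # Indices of the k candidates closest to sample i, excluding i.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # r_i: fraction of majority samples among the k nearest neighbours
    # of each minority sample, searched over the whole data set.
    r = []
    for i in min_idx:
        neigh = knn(i, range(len(X)))
        r.append(sum(y[j] != minority for j in neigh) / len(neigh))
    total = sum(r)
    # If no minority sample has majority neighbours, weight them evenly.
    weights = [ri / total for ri in r] if total > 0 else [1 / n_min] * n_min

    synthetic = []
    for i, w in zip(min_idx, weights):
        partners = knn(i, min_idx)  # nearest minority neighbours only
        # Harder-to-learn samples (larger w) receive more synthetic points.
        for _ in range(round(w * G)):
            j = rng.choice(partners)
            lam = rng.random()
            # Interpolate between the sample and the chosen partner.
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Because the per-sample counts are rounded, the number of generated points can differ from the target by a few samples. In keeping with the abstract, such a step would be applied to the training split only, so that the test split keeps its original class distribution.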
We designed an iterative model development procedure in which the classifiers were trained and tested on various training and test subsets split by stratified random sampling. For each classifier, the classification results at each iteration were used to build the confusion matrix and calculate model evaluation metrics (accuracy, precision, recall, and F1-score), which were then averaged over all independent runs to provide a fair assessment of classification performance. Moreover, binary receiver operating characteristic (ROC) curves were used to evaluate the classifier predictions and the improvements obtained by oversampling. The results showed that both random forest and LightGBM classifiers made accurate class predictions, with LightGBM achieving slightly better classification performance in each modelling scenario (with or without oversampling). For both classifiers, oversampling the training dataset significantly improved performance, as evidenced by higher values of the evaluation metrics, leading to considerably more accurate EOR type predictions; specifically, oversampling boosted the prediction accuracy of the random forest model from 78.3% to 89.5% and the LightGBM model from 77.5% to 90.2%. Additionally, feature importance rankings provided valuable insights into which input variables had the greatest impact on model development.
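The evaluation procedure described above (a confusion matrix per run, metrics derived from it, then an average over runs) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the per-class precision, recall and F1-score are macro-averaged, which the abstract does not specify.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def metrics_from_cm(cm):
    """Accuracy plus macro-averaged precision, recall and F1-score."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "accuracy": sum(cm[i][i] for i in range(n)) / total,
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def average_runs(run_metrics):
    """Average each metric over a list of per-run metric dictionaries."""
    return {
        key: sum(m[key] for m in run_metrics) / len(run_metrics)
        for key in run_metrics[0]
    }
```

In use, each independent run would contribute one dictionary from `metrics_from_cm(confusion_matrix(y_true, y_pred, labels))`, and `average_runs` would then be applied to the collected list. The class labels passed in `labels` would be the EOR types in the dataset, which the abstract does not enumerate.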

Publisher

Wiley
