Affiliation:
1. Colorado School of Mines
Abstract
Machine-learning algorithms have long aided geologic property prediction from well-log data, but they are primarily used to classify lithology, facies, formation, and rock types. More detailed properties (e.g., porosity, grain size) that are important for evaluating hydrocarbon exploration and development activities, as well as subsurface geothermal, CO2 sequestration, and hydrological studies, have not been a focus of machine-learning predictions. This study focuses on improving machine-learning regression-based workflows for quantitative geological property prediction (porosity, grain size, XRF geochemistry), using a robust dataset from the Dad Sandstone Member of the Lewis Shale in the Green River Basin, Wyoming. Twelve slabbed cores collected from wells targeting turbiditic sandstones and mudstones of the Dad Sandstone Member provide 1,212.2 ft of well-log and core data to test the efficacy of five machine-learning models, ranging in complexity from multivariate linear regression to deep neural networks. Our results demonstrate that gradient-boosted decision-tree models (e.g., CatBoost, XGBoost) are flexible in terms of input data completeness, do not require scaled data, and are reliably accurate, with the lowest or second-lowest root mean squared error (RMSE) for every test. Deep neural networks, although commonly used for these applications, never achieved the lowest error in any test. We also use newly collected XRF geochemistry and grain-size data to constrain spatiotemporal sediment routing, sand-mud partitioning, and paleo-oceanographic redox conditions in the Green River Basin. Test-train dataset splitting traditionally uses randomized inter-well data, but a blind-well testing strategy is more applicable to most geoscience applications that aim to predict properties at new, unseen well locations.
We find that error estimates from randomized inter-well splits are optimistic relative to blind-well tests, with a median difference of 0.58 RMSE when predicting grain size in phi units. Using these data and results, we establish a baseline workflow for applying machine-learning regression algorithms to core-based reservoir properties from well-log and core-image data. We hope that our findings, together with the open-source code and datasets released with this paper, will serve as a baseline for further research to improve geological property prediction for sustainable earth-resource modeling.
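The contrast between randomized inter-well splitting and blind-well testing can be illustrated with a minimal sketch. This is not the authors' released code: the synthetic data, the field names (`well`, `log`, `phi`), the per-well offset, and the single-variable least-squares model are all assumptions chosen to keep the example self-contained. Any regressor could stand in for `fit_line`.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b (stand-in for any regressor)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def score(train, test):
    """Fit on the training samples and return RMSE on the test samples."""
    a, b = fit_line([s["log"] for s in train], [s["phi"] for s in train])
    preds = [a * s["log"] + b for s in test]
    return rmse([s["phi"] for s in test], preds)


def random_split(samples, test_fraction=0.25, seed=0):
    """Randomized inter-well split: every well can appear in both sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def blind_well_splits(samples):
    """Leave-one-well-out: the held-out well never appears in training."""
    for well in sorted({s["well"] for s in samples}):
        train = [s for s in samples if s["well"] != well]
        test = [s for s in samples if s["well"] == well]
        yield well, train, test


def make_synthetic(n_wells=4, n_per_well=30, seed=1):
    """Hypothetical data: one log response and grain size (phi) per sample."""
    rng = random.Random(seed)
    samples = []
    for w in range(n_wells):
        offset = rng.gauss(0.0, 0.5)  # assumed per-well bias, for illustration only
        for _ in range(n_per_well):
            log = rng.uniform(20.0, 120.0)
            phi = 0.03 * log + offset + rng.gauss(0.0, 0.2)
            samples.append({"well": f"W{w}", "log": log, "phi": phi})
    return samples


samples = make_synthetic()
train, test = random_split(samples)
random_rmse = score(train, test)
blind_rmses = {well: score(tr, te) for well, tr, te in blind_well_splits(samples)}

print(f"randomized inter-well RMSE: {random_rmse:.3f}")
for well, value in blind_rmses.items():
    print(f"blind well {well} RMSE: {value:.3f}")
```

Comparing the single randomized score with the per-well blind scores shows how far the two evaluation strategies diverge on a given dataset. The numbers from this synthetic example say nothing about the 0.58 RMSE difference reported above.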
Publisher
Society for Sedimentary Geology
Subject
General Earth and Planetary Sciences, General Environmental Science