Affiliation:
1. Baker Hughes, Celle, Lower Saxony, Germany
Abstract
The conventional approach to fluid characterization using partial least squares (PLS) is considered a benchmark in chemometric fluid analysis. As a complementary approach, convolutional neural networks (CNNs) have been shown to provide comparable discrimination capabilities. In a comparative study, the performance of both methods for quantitative characterization of downhole fluids from near-infrared (NIR) spectra was evaluated. Both methods are used to predict the fluid composition in fractions of water, gas, oil, and mud. PLS is a statistical technique designed to model the relationship between two sets of variables, in this case between the spectrum and the composition. It relies on the representation of the variables in a multidimensional latent space. Usually, the inference consists of three steps. First, the input (spectrum) is linearly projected into the latent space. Second, the output is calculated in the latent space. Finally, the composition is computed as a linear transformation of the latent output. In this study, PLS was not used for end-to-end inference; only its first step was used, for feature extraction. By using the first latent dimension for each component, features were obtained that can be conveniently associated with water, gas, and oil, respectively. These features are then used together with the constant baseline in a multinomial logistic regression to obtain the fractions of the fluid types present in the NIR spectra. The baseline is primarily needed for mud detection. In parallel, several CNN models were trained for fluid characterization based on NIR spectra, using both processed and raw data. Hyper-parameter optimization of the CNNs was performed using a tree-structured Parzen estimator to obtain the best trial configuration. Scheduling of the optimization loop yielded improved inference results. Quantitative comparison of the PLS and CNN models was performed using a k-fold approach.
This allows a direct comparison of the methods' performance given spectra of pure and mixed fluids as input. Both methods show high accuracy when predicting pure components. The root mean square error (RMSE) is consistently larger for PLS. The CNN models generally show larger variance in the prediction for mud, with minor fractions of water, gas, and oil being inferred. Overall, the quantitative comparison of the two methods in chemometric fluid analysis shows improved predictive power for a set of deep CNNs relative to the PLS approach. Improved inference is achieved using raw NIR spectral data. This is particularly interesting as no further pre-processing of the spectra is required, thereby minimizing porting efforts in the development of embedded applications.
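The PLS-based pipeline described in the abstract (first latent dimension per component, plus a baseline, fed into a multinomial logistic regression and scored by k-fold RMSE) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the study: the synthetic 16-channel spectra in `PURE`, the use of the spectrum minimum as the baseline, the plain gradient-descent fit, and the names `first_pls_direction`, `PlsLogitModel` and `kfold_rmse`.

```python
import math
import random

N_CHANNELS = 16
# Hypothetical pure spectra: one Gaussian band each for water, gas, oil; flat offset for mud.
PURE = [[math.exp(-((j - c) ** 2) / 4.5) for j in range(N_CHANNELS)] for c in (3, 8, 12)]
PURE.append([0.5] * N_CHANNELS)

def make_sample(rng):
    """Random pure or mixed composition (water, gas, oil, mud) and its noisy spectrum."""
    if rng.random() < 0.5:
        comp = [0.0] * 4
        comp[rng.randrange(4)] = 1.0
    else:
        raw = [rng.expovariate(1.0) for _ in range(4)]
        comp = [r / sum(raw) for r in raw]
    spec = [sum(c * s[j] for c, s in zip(comp, PURE)) + rng.gauss(0.0, 0.01)
            for j in range(N_CHANNELS)]
    return spec, comp

def first_pls_direction(X, y):
    """First PLS weight vector for a single response: w proportional to Xc^T yc."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    w = [sum((row[j] - x_mean[j]) * (yi - y_mean) for row, yi in zip(X, y))
         for j in range(len(x_mean))]
    norm = math.sqrt(sum(v * v for v in w))
    return x_mean, [v / norm for v in w]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

class PlsLogitModel:
    def fit(self, X, Y, epochs=300, lr=0.3):
        # One latent direction each for water, gas and oil; mud relies on the baseline.
        self.dirs = [first_pls_direction(X, [y[k] for y in Y]) for k in range(3)]
        raw = [self._raw_features(x) for x in X]
        n, cols = len(raw), list(zip(*raw))
        self.mu = [sum(c) / n for c in cols]
        self.sd = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
                   for c, m in zip(cols, self.mu)]
        F = [self._scale(f) for f in raw]
        self.W = [[0.0] * 5 for _ in range(4)]  # 4 classes x (4 features + intercept)
        for _ in range(epochs):
            grad = [[0.0] * 5 for _ in range(4)]
            for f, y in zip(F, Y):
                p = softmax([sum(wi * fi for wi, fi in zip(w, f)) for w in self.W])
                for k in range(4):
                    for j in range(5):
                        # Cross-entropy gradient; targets may be fractional.
                        grad[k][j] += (p[k] - y[k]) * f[j] / n
            for k in range(4):
                for j in range(5):
                    self.W[k][j] -= lr * grad[k][j]
        return self

    def _raw_features(self, x):
        scores = [sum((xj - mj) * wj for xj, mj, wj in zip(x, m, w)) for m, w in self.dirs]
        return scores + [min(x)]  # assumption: baseline taken as the spectrum minimum

    def _scale(self, f):
        return [(v - m) / s for v, m, s in zip(f, self.mu, self.sd)] + [1.0]

    def predict(self, x):
        f = self._scale(self._raw_features(x))
        return softmax([sum(wi * fi for wi, fi in zip(w, f)) for w in self.W])

def kfold_rmse(X, Y, k=4):
    """Per-fold RMSE over all predicted fractions, using interleaved folds."""
    rmses = []
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        model = PlsLogitModel().fit([X[i] for i in train], [Y[i] for i in train])
        sq = [(p - t) ** 2 for i in test for p, t in zip(model.predict(X[i]), Y[i])]
        rmses.append(math.sqrt(sum(sq) / len(sq)))
    return rmses

rng = random.Random(0)
data = [make_sample(rng) for _ in range(80)]
X, Y = [s for s, _ in data], [c for _, c in data]
rmses = kfold_rmse(X, Y)
print("per-fold RMSE:", [round(r, 3) for r in rmses])
```

Because the softmax output always sums to one, the predicted fractions are valid compositions by construction. The same `kfold_rmse` loop could in principle wrap any other model exposing `fit` and `predict`, which is what would make a like-for-like comparison with a CNN possible.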