Affiliation:
1. Industrial University of Tyumen
Abstract
An algorithm for analog-to-digital conversion of primary geological and geophysical information is presented, using as an example the identification of rock lithotypes from text descriptions of physical core. The work combines three types of scientific research (exploratory, interdisciplinary, and applied) in forming the initial base of qualitative data. Common algorithms for classifying textual information are described, along with a mechanism for preprocessing the initial data by tokenization. The concept of text pattern recognition is implemented using artificial intelligence methods. The neural network model for recognizing textual geological and geophysical information was built in the Python programming language, combined with convolutional neural networks for text classification (TextCNN), bidirectional long short-term memory networks (BiLSTM), and Bidirectional Encoder Representations from Transformers (BERT). After the basic version of the neural network model for qualitative information recognition was developed and tested, this technology stack provided an acceptable level of performance for the algorithm of digital transformation of text data. The best result (current version of the neural network model: 1.0; more than 3000 examples for training and testing) was achieved with the BERT-based text recognition algorithm: a validation accuracy of ~0.830173 at the 25th epoch, a validation loss of ~0.244719, a training loss of ~0.000984, and a recognition probability of more than 95 % for the studied rock lithotypes. Mechanisms for modifying the code to further improve the accuracy of text prediction with the created neural network were identified.
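The tokenization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `build_vocab`, `encode`), the special tokens, the padding scheme, and the two sample core descriptions are all assumptions made for illustration, and a BERT-based model would normally use its own subword tokenizer instead.

```python
import re
from collections import Counter

# Hypothetical special tokens; the source does not specify any.
PAD, UNK = "<pad>", "<unk>"

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def build_vocab(texts, min_count=1):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    vocab = {PAD: 0, UNK: 1}
    for tok, n in sorted(counts.items()):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of ids, truncating or padding as needed."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(text)][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))

# Invented sample core descriptions, for illustration only.
descriptions = [
    "Sandstone, fine-grained, light gray, oil-saturated",
    "Siltstone, dark gray, with clay interlayers",
]
vocab = build_vocab(descriptions)
encoded = [encode(d, vocab, max_len=8) for d in descriptions]
```

Fixed-length integer sequences of this kind are the usual input to embedding layers in TextCNN or BiLSTM classifiers; words absent from the vocabulary fall back to the `<unk>` id.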
Publisher
Industrial University of Tyumen