Abstract
The accuracy of coal mine water inrush prediction models is limited mainly by the small number of samples and the difficulty of feature extraction. In this paper, a new data augmentation method for water inrush prediction is proposed. The method uses SMOTE improved with natural neighbors theory and a mutual information dropout sparse autoencoder to augment water inrush data and predict the risk of water inrush in coal mines. By learning water inrush features through the autoencoder, better separation between classes is achieved and the influence of class overlap in the original samples is weakened. Then, the natural neighbors search algorithm is used to determine the intrinsic neighbor relationships between samples and to remove outliers and noise samples, and different oversampling methods are applied to borderline samples and center samples in the minority class. Synthetic samples are generated in the feature space, mapped back to the original space, and merged with the original samples to form an expanded water inrush dataset. Finally, the effectiveness of the proposed method is confirmed by comparing measured water inrush data with prediction model results for typical mining areas in North China. The results of this study can be used to analyze the characteristics of water inrush accidents more accurately, improve the accuracy of water inrush accident prediction, and promote the application of machine learning in water inrush prediction.
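To make the oversampling step concrete, the following is a minimal sketch of natural-neighbor-guided SMOTE interpolation. It is an illustration under several assumptions, not the implementation used in this paper: the function names `natural_neighbors` and `oversample_minority` are hypothetical, the search is run on the minority samples only, samples with no natural neighbor are treated as outliers, and both the autoencoder feature mapping and the separate handling of borderline and center samples are omitted.

```python
import numpy as np

def natural_neighbors(X):
    """Simplified natural neighbor search (illustrative, hypothetical helper).

    Grows the neighborhood size r until every point has a reverse neighbor,
    or the number of points without one stops changing.
    """
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)        # a point is never its own neighbor
    order = np.argsort(dist, axis=1)      # neighbors sorted by distance
    knn = [set() for _ in range(n)]       # r nearest neighbors of each point
    reverse = [set() for _ in range(n)]   # points that list i as a neighbor
    prev_orphans = -1
    r = 0
    for r in range(1, n):
        for i in range(n):
            j = int(order[i, r - 1])
            knn[i].add(j)
            reverse[j].add(i)
        orphans = sum(1 for s in reverse if not s)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Natural neighbors are mutual: j is in knn[i] and i is in knn[j].
    return [knn[i] & reverse[i] for i in range(n)], r

def oversample_minority(X_min, n_new, seed=0):
    """SMOTE-style interpolation restricted to natural neighbors (assumption)."""
    rng = np.random.default_rng(seed)
    nn, _ = natural_neighbors(X_min)
    # Assumption: samples without any natural neighbor are outliers or noise.
    seeds = [i for i in range(len(X_min)) if nn[i]]
    if not seeds:
        return np.empty((0, X_min.shape[1]))
    synthetic = []
    for _ in range(n_new):
        i = int(rng.choice(seeds))
        j = int(rng.choice(sorted(nn[i])))
        gap = rng.random()                # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)
```

In the pipeline described above, `X_min` would be the minority-class samples after encoding into the autoencoder feature space, and the synthetic samples would then be decoded back to the original space before merging with the measured data.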