Affiliation:
1. Chang’an University
2. Chinese Academy of Sciences
3. Hohai University
Abstract
Spatial prediction (SP) based on machine learning (ML) has been applied to soil water quality, air quality, the marine environment, and other domains. However, it still handles small samples poorly, because ML normally requires a large number of training samples to prevent overfitting. Data augmentation methods such as mixup and the synthetic minority over-sampling technique (SMOTE) ignore the similarity of geographic information. Therefore, this paper proposes a modified upsampling method and combines it with random forest spatial interpolation (RFSI) to deal with the small sample problem in geographical space. The modified upsampling differs from the existing methods in two respects. First, the samples are classified, and the nearest points are selected from within the same category, so that they share similar geographic information. Second, the difference used to generate new samples is the difference within each category. To verify the effectiveness of the proposed method, we select precipitation as the target factor and conduct a comparative experiment. The experimental results show that combining the modified upsampling method with RFSI effectively improves the accuracy of spatial prediction.
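The two modifications can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes that a category label is already available for every sample, that nearest points are found by Euclidean distance in feature space, and that new samples are generated by SMOTE-style linear interpolation between a point and one of its same-category neighbours. The function name `class_aware_upsample` and the parameters `n_new`, `k`, and `seed` are hypothetical.

```python
import numpy as np


def class_aware_upsample(X, y, labels, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating within categories.

    X      : (n, d) array of features (e.g. coordinates and covariates)
    y      : (n,) array of target values (e.g. precipitation)
    labels : (n,) array of category labels from a prior classification
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)

    # Only samples whose category has at least two members can be interpolated.
    cats, counts = np.unique(labels, return_counts=True)
    eligible = np.flatnonzero(np.isin(labels, cats[counts >= 2]))
    if eligible.size == 0:
        raise ValueError("no category contains two or more samples")

    new_X, new_y, new_labels = [], [], []
    for _ in range(n_new):
        i = rng.choice(eligible)
        # Candidate neighbours are restricted to the same category as sample i.
        same = np.flatnonzero(labels == labels[i])
        others = same[same != i]
        dist = np.linalg.norm(X[others] - X[i], axis=1)
        nearest = others[np.argsort(dist)[:k]]
        j = rng.choice(nearest)
        # The difference X[j] - X[i] is therefore a within-category difference.
        lam = rng.random()
        new_X.append(X[i] + lam * (X[j] - X[i]))
        new_y.append(y[i] + lam * (y[j] - y[i]))
        new_labels.append(labels[i])

    return np.array(new_X), np.array(new_y), np.array(new_labels)
```

Under these assumptions, the augmented set (original plus synthetic samples) would then be passed to the spatial predictor; because every synthetic sample is a convex combination of two samples from one category, it never mixes information across categories.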
Publisher
Research Square Platform LLC