Affiliation:
1. Center for Agricultural Water Research in China, China Agricultural University, Beijing 100083, China
2. Ordos City River and Lake Protection Centre, Ordos 017010, China
3. Ordos Water and Drought Disaster Prevention Technology Centre, Ordos 017010, China
Abstract
Machine learning (ML)-based models are widely used to simulate and predict complex physical systems. Lakes are important indicators in arid and semi-arid areas, and proper management of the water resources in a lake basin requires estimating and predicting lake dynamics from hydro-meteorological variations and anthropogenic disturbances. This task is particularly challenging in arid and semi-arid regions, where water scarcity poses a significant threat to human life. In this study, a typical arid area of China was selected as the study area, and the performance of eight widely used ML models (Bayesian Ridge (BR), K-Nearest Neighbor (KNN), Gradient Boosting Decision Tree (GBDT), Extra Trees (ET), Random Forest (RF), Adaptive Boosting (AB), Bootstrap aggregating (Bagging), and eXtreme Gradient Boosting (XGB)) was evaluated in predicting lake area. Monthly lake area was modelled as a function of meteorological factors (precipitation, air temperature, and the Standardised Precipitation Evapotranspiration Index (SPEI)) and anthropogenic factors (ETc, NDVI, LUCC). Lake area derived from Landsat satellite image classification for 2000–2020 was analysed alongside the SPEI at 9- and 12-month time scales. Across the six input variables and eight ML algorithms evaluated, the RF models performed best when using the SPEI-9 index, with R² = 0.88, RMSE = 1.37, LCCC = 0.95, and PRD = 1331.4 for the test samples. Furthermore, the performance of the ML model constructed with the 9-month time scale SPEI (SPEI-9) as an input variable (MLSPEI-9) varied by season, with average relative errors as high as 0.62 in spring and as low as 0.12 in summer. Overall, this study provides valuable insights into the effectiveness of different ML models for predicting lake area, demonstrating that appropriate input selection can improve performance by up to 13.89%.
These findings have important implications for future research on lake area prediction in arid zones and demonstrate the power of ML models in advancing scientific understanding of complex natural systems.
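For readers who want to reproduce this kind of comparison, the following is a minimal sketch of three of the goodness-of-fit statistics reported above. It assumes that LCCC denotes Lin's concordance correlation coefficient and that R² is the coefficient of determination; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def lccc(observed, predicted):
    """Lin's concordance correlation coefficient (assumed meaning of LCCC).

    Uses population (1/n) variances and covariance, as in Lin's definition.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n
    return 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2)


if __name__ == "__main__":
    # Hypothetical monthly lake areas (observed vs. model output), for illustration only.
    observed = [30.1, 28.4, 27.9, 31.2, 33.0, 29.5]
    predicted = [29.6, 28.9, 27.1, 30.8, 32.1, 30.2]
    print("R2  :", round(r2_score(observed, predicted), 3))
    print("RMSE:", round(rmse(observed, predicted), 3))
    print("LCCC:", round(lccc(observed, predicted), 3))
```

Unlike R², the concordance coefficient penalises systematic bias as well as scatter, so a model whose predictions are consistently offset from the observations scores lower on LCCC even when the two series are perfectly correlated.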
Funder
A special project entitled Science and Technology for the Development of Mongolia, Department of Science and Technology of Inner Mongolia
Subject
General Earth and Planetary Sciences