Machine Learning-Driven Landslide Susceptibility Mapping in the Himalayan China–Pakistan Economic Corridor Region
Author:
Ullah Mohib (1), Tang Bingzhe (1), Huangfu Wenchao (1), Yang Dongdong (1), Wei Yingdong (1), Qiu Haijun (1, 2)
Affiliation:
1. Shaanxi Key Laboratory of Earth Surface and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
2. Institute of Earth Surface System and Hazards, College of Urban and Environmental Sciences, Northwest University, Xi’an 710127, China
Abstract
The reliability of data-driven approaches in generating landslide susceptibility maps depends on data quality, analytical method selection, and sampling techniques. Selecting optimal datasets and determining the most effective analytical methods pose significant challenges. This study assesses the performance of seven machine learning classifiers in the Himalayan region of the China–Pakistan Economic Corridor, using statistical techniques and validation metrics. Thirteen geo-environmental variables were analyzed, including topographic (8), land cover (1), hydrological (1), geological (2), and meteorological (1) factors. These variables were evaluated for multicollinearity, feature importance, and their influence on landslide occurrence. Our findings indicate that Support Vector Machines and Logistic Regression were highly effective, particularly near fault zones and roads, owing to their ability to handle complex, non-linear terrain interactions. Conversely, Random Forest and Logistic Regression demonstrated variability in their results. Each model distinctly identified landslide susceptibility zones ranging from very low to very high risk. Significant conditioning variables such as elevation, rainfall, lithology, slope, and land use were identified, reflecting the unique geomorphological conditions of the Himalayas. Further analysis using the Variance Inflation Factor and Pearson correlation coefficient showed minimal multicollinearity among the variables. Moreover, Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values confirmed the strong predictive capabilities of the models, with the Random Forest Classifier performing exceptionally well, achieving an AUC of 0.96 and an F-Score of 0.86. This study highlights the importance of selecting models according to dataset characteristics to improve decision-making and strategy effectiveness.
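The diagnostics named in the abstract (Pearson correlation for multicollinearity screening, AUC-ROC, and F-Score) can be illustrated with a minimal Python sketch. This is not the authors' code or data: the function names, the toy labels and scores, and the 0.5 classification threshold are all assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking definition rather than any specific library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def f_score(labels, scores, threshold=0.5):
    """F1 score after thresholding scores (threshold is an assumed default)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for lab, p in zip(labels, preds) if lab == 1 and p == 1)
    fp = sum(1 for lab, p in zip(labels, preds) if lab == 0 and p == 1)
    fn = sum(1 for lab, p in zip(labels, preds) if lab == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = landslide cell, 0 = non-landslide cell,
# with made-up susceptibility scores from some classifier.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

# Hypothetical values of two conditioning variables at four locations.
toy_var_a = [1, 2, 3, 4]
toy_var_b = [2, 1, 4, 3]

toy_auc = auc_roc(toy_labels, toy_scores)
toy_f1 = f_score(toy_labels, toy_scores)
toy_r = pearson_r(toy_var_a, toy_var_b)
```

In practice, a pair of conditioning variables with a high absolute correlation would be a candidate for removal before model fitting, while AUC and F-Score would be computed on a held-out validation set rather than on toy values like these.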
Funder
National Natural Science Foundation of China
Key Research and Development Program of Shaanxi
Second Tibetan Plateau Scientific Expedition and Research Program