Effects of non-landslide sampling strategies on machine learning models in landslide susceptibility mapping.

Sci Rep

Wuhan Tianjihang Information Technology Co., Ltd., Wuhan, 430074, China.

Published: March 2024

This study explores the effects of different non-landslide sampling strategies on machine learning models in landslide susceptibility mapping. Non-landslide samples are inherently uncertain, and their selection may suffer from noise or from insufficient regional representation, either of which can reduce the accuracy of the results. In this study, a positive-unlabeled (PU) bagging semi-supervised learning method was introduced for non-landslide sample selection. Buffer control sampling (BCS) and K-means (KM) clustering were applied for comparison. Based on landslide data from Qiaojia County, Yunnan Province, China, collected in 2014, three machine learning models (random forest, support vector machine, and CatBoost) were used for landslide susceptibility mapping. The results show that the quality of the selected samples varies significantly across non-landslide sampling strategies. Overall, the PU bagging method selects the highest-quality non-landslide samples, and it performs best when combined with CatBoost (AUC = 0.897), with 82.14% of landslides falling in the very high and high susceptibility zones. The KM results indicated overfitting, with high validation accuracy but poor zoning statistics. The BCS results were the worst.
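The abstract does not give the details of the PU bagging procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: known landslides are treated as positives, every other location as unlabeled, and a simple nearest-centroid scorer stands in for whatever base learner the authors actually used. The function names (`pu_bagging_scores`, `select_non_landslides`) and the parameters `n_rounds` and `seed` are hypothetical.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, n_rounds=50, seed=0):
    """Average out-of-bag landslide-likeness score for each unlabeled sample.

    Each round draws a bootstrap bag of unlabeled samples, treats the bag as
    provisional negatives, and scores only the samples left out of the bag.
    Scores near 1 mean "looks like a landslide", scores near 0 the opposite.
    """
    rng = random.Random(seed)
    k = len(positives)  # assumption: bag size equals the number of positives
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    pos_c = centroid(positives)
    for _ in range(n_rounds):
        bag = [rng.randrange(len(unlabeled)) for _ in range(k)]  # with replacement
        in_bag = set(bag)
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in in_bag:
                continue  # only out-of-bag samples are scored
            d_pos, d_neg = sq_dist(x, pos_c), sq_dist(x, neg_c)
            # Stand-in base learner: relative closeness to the positive centroid.
            score = d_neg / (d_pos + d_neg) if (d_pos + d_neg) > 0 else 0.5
            totals[i] += score
            counts[i] += 1
    # A sample that was never out-of-bag gets a neutral score of 0.5.
    return [t / c if c else 0.5 for t, c in zip(totals, counts)]


def select_non_landslides(scores, n):
    """Indices of the n unlabeled samples with the lowest average score."""
    return sorted(range(len(scores)), key=lambda i: scores[i])[:n]
```

Unlabeled samples with the lowest averaged scores are the candidates least likely to be unrecorded landslides, which is the sense in which PU bagging can filter noisy non-landslide samples. In practice the stand-in scorer would be replaced by a real classifier such as a decision tree.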

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10965908
DOI: http://dx.doi.org/10.1038/s41598-024-57964-5

Publication Analysis

Top Keywords (keyword: count)

non-landslide sampling: 12
sampling strategies: 12
machine learning: 12
learning models: 12
landslide susceptibility: 12
susceptibility mapping: 12
non-landslide samples: 12
effects non-landslide: 8
strategies machine: 8
models landslide: 8

Similar Publications

Developing effective strategies to predict landslide-susceptible areas and reduce risk is vital. This involves using ensemble methods to achieve precise predictions while addressing challenges such as limited data. Recent studies have highlighted the potential of ensemble methods to improve the accuracy of landslide susceptibility maps (LSM).

Article Synopsis
  • The study focuses on improving landslide susceptibility prediction by integrating 'landslide priors' (previous knowledge about landslides) with a deep learning model to enhance its effectiveness and adaptability.
  • It employs techniques like a variational autoencoder to clarify input features and develops a specialized loss function that incorporates physical constraints related to landslides.
  • The results show that the combined model outperforms traditional data-driven methods in various accuracy metrics and emphasizes the significance of factors such as 'slope' and 'rainfall' in predicting landslide occurrences.

Landslide susceptibility assessment (LSA) is fundamental for managing landslide geological disasters. This study presents a deep learning approach (DNN-MSFM) designed to enhance LSA modeling, particularly by addressing limitations caused by the unbalanced distribution of data samples in applied datasets. The DNN-MSFM approach combines a deep neural network (DNN) with a mean squared false misclassification (MSFM) loss function to handle unbalanced samples from the algorithmic perspective.

Epistemic uncertainty in data-driven landslide susceptibility assessment is often increased by the limited accuracy of an individual model and by uncertainties associated with the selection of non-landslide samples. To address these issues, this paper focuses on landslide disasters in Ji'an City, China, and proposes a heterogeneous ensemble learning method incorporating frequency ratio (FR) and semi-supervised sample expansion. Based on the superimposed results of 12 environmental factor frequency ratios (FFR), non-landslide samples were selected and input, along with historical landslide samples, into light gradient boosting machine (LightGBM), random forest (RF), and convolutional neural network (CNN) models for prediction.
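The frequency ratio mentioned above is commonly defined, for each class of an environmental factor, as the share of landslide cells in that class divided by the share of all cells in that class. The sketch below illustrates that standard definition and a simple per-cell sum across factors; it is an illustration only, not the paper's exact procedure, and the function names and the list-of-class-labels input layout are assumptions.

```python
from collections import Counter


def frequency_ratios(cell_classes, landslide_cells):
    """FR per class = (share of landslide cells in class) / (share of all cells in class).

    cell_classes: class label of one factor for every grid cell.
    landslide_cells: indices of the cells that contain a recorded landslide.
    """
    total_cells = len(cell_classes)
    n_slides = len(landslide_cells)
    area = Counter(cell_classes)
    slides = Counter(cell_classes[i] for i in landslide_cells)
    return {
        c: (slides[c] / n_slides) / (area[c] / total_cells)
        for c in area
    }


def superimposed_fr(factor_layers, landslide_cells):
    """Sum each cell's FR over all factor layers (assumed superposition rule)."""
    totals = [0.0] * len(factor_layers[0])
    for layer in factor_layers:
        fr = frequency_ratios(layer, landslide_cells)
        for i, c in enumerate(layer):
            totals[i] += fr[c]
    return totals
```

Cells with a low summed FR are rarely associated with landslides across the factors, which is why they are plausible candidates for non-landslide samples.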

