A novel framework for flood susceptibility assessment using hybrid analytic hierarchy process-based machine learning methods

Authors:
This study evaluates the effectiveness of six analytic hierarchy process (AHP)-based machine learning models in predicting flood susceptibility in the Dwarakeswar River basin in eastern India. Fifteen flood conditioning factors were used as input predictors. The dataset underwent a series of pre-processing steps, including Pearson correlation, ordinary least squares (OLS) regression, and multicollinearity analysis, to identify the most relevant flood-contributing factors. Additionally, the Information Gain Ratio (InGR) feature selection technique was used to assess feature relevance. Model performance during the validation phases was assessed using accuracy, kappa score, sensitivity, specificity, positive predictive value, negative predictive value, and the area under the receiver operating characteristic curve (AUC). Although all models demonstrated robust flood prediction abilities (AUC > 0.988), the AHP-Gradient Boosting Machine (AHP-GBM) model performed best (AUC = 0.996). Among the models examined, the AHP-GBM model therefore holds significant promise for evaluating flood-prone regions and supporting effective planning and management of flood hazards. This model classified 12.68% of the study area as a very high flood susceptibility zone and 5.14% as a high flood susceptibility zone. SHapley Additive exPlanations (SHAP) analysis shows that the Modified Normalized Difference Water Index (MNDWI), rainfall, elevation, the Normalized Difference Vegetation Index (NDVI), proximity to rivers, drainage density, and the Terrain Ruggedness Index (TRI) are the most influential factors for flood probability.
Based on future climate projections from Coupled Model Intercomparison Project Phase 6 (CMIP6) models under the SSP2-4.5 and SSP5-8.5 scenarios, the southern region of the study area has been identified as a flood vulnerability hotspot, with very high susceptibility covering 16.68% of the area.
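The validation metrics listed above all derive from a binary confusion matrix (flood versus non-flood) and from the ranking of predicted scores. The following is a minimal sketch of their standard definitions, not the study's own implementation. The names `ConfusionCounts`, `validation_metrics`, and `rank_auc`, and the counts in the usage note, are hypothetical and chosen only for illustration. It assumes both classes are present, so no denominator is zero.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary confusion-matrix counts."""
    tp: int  # flood points predicted as flood
    fp: int  # non-flood points predicted as flood
    tn: int  # non-flood points predicted as non-flood
    fn: int  # flood points predicted as non-flood


def validation_metrics(c: ConfusionCounts) -> dict:
    """Standard metrics; assumes every row and column total is non-zero."""
    n = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)  # true positive rate
    specificity = c.tn / (c.tn + c.fp)  # true negative rate
    ppv = c.tp / (c.tp + c.fp)          # positive predictive value
    npv = c.tn / (c.tn + c.fn)          # negative predictive value

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_flood = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_non_flood = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    p_chance = p_flood + p_non_flood
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


def rank_auc(flood_scores: list, non_flood_scores: list) -> float:
    """AUC as the probability that a flood point outscores a non-flood point.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for p in flood_scores:
        for q in non_flood_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(flood_scores) * len(non_flood_scores))
```

For example, `validation_metrics(ConfusionCounts(tp=45, fp=5, tn=40, fn=10))` gives an accuracy of 0.85 and a kappa of 0.70 on these made-up counts, which illustrates how kappa discounts the agreement expected by chance.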