Flood Risk Assessment Using Analytical Hierarchy Process (AHP) and Machine Learning Models

A step‑by‑step case study on how Analytical Hierarchy Process (AHP) and machine learning models can support decision‑makers in understanding and managing flood risk. These models help quantify the likelihood and severity of damage to buildings, crops, and communities — enabling faster, more informed, and more compassionate responses during crises.
Introduction
Natural disasters represent one of the most pressing challenges facing societies at global, regional, and local scales. Around the world, extreme events such as floods, droughts, and wildfires have become more frequent and more intense as a result of ongoing climate change. While these hazards are driven by natural forces, human activities often amplify their impacts. Deforestation, land degradation, rapid population growth, unplanned urbanization, poor land‑use planning, and inadequate drainage infrastructure can all exacerbate the consequences of heavy rainfall and lead to severe flooding.
Togo, a small nation in West Africa, illustrates how vulnerable communities can be to such compounded risks. Flooding and drought are common across the country and have serious socio‑economic consequences for residents, ecosystems, and the economy. In recent years, excessive rainfall has washed away infrastructure and destroyed cultivated land. Monitoring, mapping, and predicting flood risk therefore play a critical role in preparing for natural disasters and designing effective mitigation strategies. Flood risk maps help planners understand where damage is most likely, enabling them to prioritize resources and interventions.

Monthly prediction of temperature and precipitation
The diagram above depicts the monthly prediction of temperature and precipitation across Togo. It underscores how seasonal variability and climate trends influence rainfall patterns, and therefore why decision‑makers must consider both natural and human factors when assessing flood risk.
Analytical Hierarchy Process (AHP)
To systematically identify and map areas with high flood risk in Togo, the team developed an Analytical Hierarchy Process (AHP) model. AHP is a multi‑criteria decision‑making approach that combines a range of conditioning factors such as drainage density, soil type, slope, precipitation, population density, Euclidean distance to rivers, and land use/land cover (LULC). By integrating these diverse datasets, the method produces hazard maps that highlight where flooding is most likely to occur and vulnerability maps that show where communities are most susceptible to damage. When combined, these maps yield a comprehensive picture of flood risk.
Hazard factors
The hazard map evaluates how environmental conditions contribute to the likelihood and intensity of flooding. Drainage density — the total length of streams and channels within a basin relative to its area — indicates how readily water accumulates; higher density usually means a greater probability of runoff. Precipitation, especially during intense rainfall periods, is a major driver of flooding. The slope of the terrain affects the speed at which surface water runs off; steep slopes produce rapid runoff and can increase flood danger. Finally, soil type influences infiltration: soils with a high clay content absorb water more slowly than sandy soils, resulting in more surface runoff and higher flood susceptibility.
Vulnerability factors
Where hazard describes the potential for flooding to occur, vulnerability reflects how exposed and sensitive communities are to its impacts. Proximity to main river channels or flow paths, measured using Euclidean distance, influences flood exposure. Land use and land cover reveal vegetation type, cultivation patterns, and human activities that determine environmental resilience. Population density captures the degree to which people live in flood‑prone areas; rapidly growing or informally developed zones may face heightened risk because of inadequate infrastructure and planning.
AHP methodology
The AHP method uses a hierarchical structure to break down the complex problem of flood risk into manageable steps. At the top level is the main objective: creating a flood risk map. This objective is supported by two criteria — the hazard map and the vulnerability map — and each criterion contains its own set of factors such as drainage density, soil type, slope, precipitation, population density, Euclidean distance to rivers, and land use/land cover. By building this hierarchy, the team could analyze the relative importance of each factor.
For each criterion, a pairwise comparison matrix was created. Experts compared every factor against every other factor using the Saaty scale (1 representing equal importance and 9 representing extreme importance). For example, when comparing drainage density (D) and soil type (ST), one might judge D to be moderately more important (score 3), while the reciprocal (1/3) reflects the opposite comparison. The following table summarizes the Saaty scale used for these judgements:
| Scale | Meaning |
| --- | --- |
| 1 | Equally important |
| 3 | Moderately important |
| 5 | Strongly important |
| 7 | Very strongly important |
| 9 | Extremely important |
| 2, 4, 6, 8 | Intermediate values between adjacent scales |
After constructing the pairwise matrix, the team calculated an eigenvector V_p for each factor by taking the geometric mean of each row and normalizing the result. This produced a set of preliminary weights reflecting the relative influence of drainage density (D), soil type (ST), slope (S), and precipitation (P). The eigenvector results are shown below:
| Factor | D | ST | S | P | V_p |
| --- | --- | --- | --- | --- | --- |
| D | 1 | 3 | 1/3 | 1/5 | 0.67 |
| ST | 1/3 | 1 | 1/3 | 1/5 | 0.39 |
| S | 3 | 3 | 1 | 1/3 | 1.32 |
| P | 5 | 5 | 3 | 1 | 2.94 |
The preliminary weights were then normalized to obtain the weighting coefficients C_p. Dividing each V_p by the sum of all V_p values yielded the weights in the table below. Notice that their sum equals 1, as required by the AHP procedure.
| Factor | D | ST | S | P | V_p | C_p |
| --- | --- | --- | --- | --- | --- | --- |
| D | 1 | 3 | 1/3 | 1/5 | 0.67 | 0.13 |
| ST | 1/3 | 1 | 1/3 | 1/5 | 0.39 | 0.07 |
| S | 3 | 3 | 1 | 1/3 | 1.32 | 0.25 |
| P | 5 | 5 | 3 | 1 | 2.94 | 0.55 |
| Sum | | | | | 5.32 | 1.00 |
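The row geometric-mean calculation above can be reproduced in a few lines of NumPy. This is a minimal sketch of the standard AHP weighting step, not the team's actual code:

```python
import numpy as np

# Pairwise comparison matrix for the hazard factors,
# rows/columns ordered D, ST, S, P as in the table above.
A = np.array([
    [1,   3,   1/3, 1/5],
    [1/3, 1,   1/3, 1/5],
    [3,   3,   1,   1/3],
    [5,   5,   3,   1  ],
])

# Preliminary weights V_p: geometric mean of each row.
V_p = np.prod(A, axis=1) ** (1 / A.shape[0])

# Weighting coefficients C_p: normalize so the weights sum to 1.
C_p = V_p / V_p.sum()
```

Rounding `V_p` and `C_p` to two decimals recovers the values in the table, and `C_p` sums to 1 as the AHP procedure requires.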
To ensure the subjective judgements were consistent, the consistency ratio (CR) was calculated. The maximum eigenvalue (λ_max) of the matrix was found to be 4.1975, leading to a consistency index (CI) of 0.066. Dividing by the random index (RI) value of 0.9 for a 4×4 matrix gave a CR of 0.073, which is below the acceptable 10 % threshold. Thus, the weights were considered reliable.
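The consistency check described above can be verified numerically. A minimal sketch using the same pairwise matrix and geometric-mean weights:

```python
import numpy as np

# Pairwise matrix for D, ST, S, P (from the tables above).
A = np.array([
    [1,   3,   1/3, 1/5],
    [1/3, 1,   1/3, 1/5],
    [3,   3,   1,   1/3],
    [5,   5,   3,   1  ],
])

# Normalized weights from row geometric means.
w = np.prod(A, axis=1) ** (1 / len(A))
w = w / w.sum()

# lambda_max: multiply A by the weight vector, divide each component
# by its weight, and average the resulting estimates.
lam_max = np.mean((A @ w) / w)

n = len(A)
CI = (lam_max - n) / (n - 1)   # consistency index
RI = 0.90                      # Saaty's random index for a 4x4 matrix
CR = CI / RI                   # consistency ratio; acceptable if below 0.10
```

Running this reproduces the reported values: λ_max ≈ 4.1975, CI ≈ 0.066, CR ≈ 0.073.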

Figure 3 – AHP Pipeline
The AHP pipeline above illustrates the workflow — from data collection and preprocessing through pairwise comparisons, weight computation, and map generation. It underscores the systematic nature of the method, which begins with gathering relevant datasets and culminates in a combined flood risk map.
Data Collection
Creating accurate hazard and vulnerability maps requires a diverse set of spatial datasets. The team assembled the following sources to capture environmental and socio‑economic characteristics across Togo: a country boundary shapefile from DIVA GIS, digital elevation models (ALOS PALSAR and ALOS World 3D) to represent topography, land use and land cover information from the Copernicus Global Land Service, precipitation records from the Center for Hydrometeorology and Remote Sensing (CHRS) at the University of California, Irvine, population density from Facebook Data for Good, soil maps from the Food and Agriculture Organization of the United Nations, and river or stream network data from Stanford University. Together, these datasets formed a rich foundation for understanding how water flows through the landscape and where people and assets are located.

Maps of digital elevation, stream network and population density

Data collected from various sources
The illustrations above show examples of these input layers, including the digital elevation model, the stream network and population density maps, and a composite overview of the datasets collected. Each layer captures a different aspect of the physical or human environment, and together they allow the AHP model to account for both natural and socio‑economic drivers of flood risk.
Data Pre‑processing
Once the raw datasets were obtained, they were processed and standardized using GIS tools such as QGIS and ArcGIS. Slope maps were derived from the digital elevation model, providing insight into how quickly water will run off the land. Euclidean distance maps were computed from the river network to measure how close each location is to major channels and potential overflow points. Drainage density maps were calculated in a similar fashion, quantifying the concentration of streams in each catchment. All layers were then standardized and reclassified into comparable units so that they could be combined consistently in the AHP model. This preprocessing step ensured that differences in scale or measurement units did not bias the results.
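As a toy illustration of the reclassification step, a slope raster can be binned into comparable 1–5 susceptibility classes with NumPy. The class breaks below are hypothetical, not the study's actual thresholds:

```python
import numpy as np

# Toy slope raster in degrees, as would be derived from the DEM.
slope = np.array([[ 2.0,  8.0, 15.0],
                  [25.0, 40.0,  5.0]])

# Reclassify into 1-5 classes so the layer is comparable with the
# other standardized factors (hypothetical break values in degrees).
breaks = [5, 10, 20, 30]
classes = np.digitize(slope, breaks) + 1   # 1 = gentlest, 5 = steepest
```

The same pattern applies to the other continuous layers (precipitation, drainage density, Euclidean distance), each with its own break values.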

Maps generated from collected data
The figure above illustrates some of these derived products, demonstrating how raw data were transformed into usable layers for analysis.
AHP Modelling
Creating the hierarchy
In the AHP framework, the flood risk problem was decomposed into a clear hierarchy. At level 0 sits the main objective: producing a flood risk map. Level 1 contains the two broad criteria — hazard and vulnerability — which capture the environmental and socio‑economic aspects of flooding. Level 2 lists the specific factors that influence each criterion, such as drainage density, soil type, slope, precipitation, population density, Euclidean distance to rivers, and land use/land cover. These factors were chosen based on both literature and expert judgement.

AHP hierarchy
The hierarchical diagram conveys how the main objective is broken down into criteria and sub‑criteria. By structuring the problem in this way, the team could systematically evaluate how each element contributes to flood risk.
Pairwise comparison and weighting
For each criterion, a pairwise comparison matrix was generated using the previously described Saaty scale. Subject‑matter experts assessed the relative importance of every pair of factors. The resulting matrices were then processed to calculate eigenvectors and normalized weights (C_p). Consistency checks ensured that the judgements were internally coherent; a consistency ratio of 0.073 indicated that the weights were acceptable.

Matrix multiplication illustrating the computation of weights
The graphic above shows the matrix multiplication used in the consistency check. Multiplying the pairwise matrix by the eigenvector yields the A3 vector; dividing each element of A3 by the corresponding weight gives per-row estimates of λ_max, whose average feeds the consistency index and ratio.
Generating the hazard map
Hazard refers to natural or human‑made phenomena that occur with an intensity capable of causing harm through stream overflow. By combining the conditioning factors — drainage density (D), soil type (ST), slope (S), and precipitation (P) — and applying their weights, the team calculated a hazard index using the formula:
\[ \text{Hazard index} = 0.13 \times D + 0.07 \times ST + 0.25 \times S + 0.55 \times P \]
Higher values of this index correspond to areas where the environment is most conducive to flooding. The resulting hazard map identifies regions that are likely to experience flood events by combining the influences of topography, hydrology, soil properties, and rainfall.
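Applied to reclassified rasters, the weighted overlay is an element-wise linear combination. A minimal NumPy sketch with toy 2×2 class grids (values 1–5, higher meaning more flood-prone):

```python
import numpy as np

# Toy reclassified factor rasters; in the study these are the
# standardized GIS layers described in the preprocessing step.
D  = np.array([[1, 3], [5, 2]])   # drainage density class
ST = np.array([[2, 2], [4, 1]])   # soil type class
S  = np.array([[1, 4], [5, 3]])   # slope class
P  = np.array([[2, 5], [5, 1]])   # precipitation class

# Weighted linear combination using the AHP weights from the text.
hazard = 0.13 * D + 0.07 * ST + 0.25 * S + 0.55 * P
```

Each cell of `hazard` is the hazard index for that location; in a GIS this is the raster-calculator equivalent of the formula above.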

AHP workflow for generating hazard map
Generating the vulnerability map
Vulnerability represents the degree to which people and assets are exposed and sensitive to flood impacts. Flood vulnerability mapping entails assessing how susceptible an area is to inundation and damage. Following the same AHP process used for the hazard map, the team calculated the weights for population density (PD), land use/land cover (LULC), and Euclidean distance (ED) to derive a vulnerability index using the formula:
\[ \text{Vulnerability index} = 0.26 \times PD + 0.64 \times LULC + 0.10 \times ED \]

AHP workflow for generating vulnerability map
Flood risk map
The final flood risk map combines the hazard and vulnerability indices through multiplication. Areas with both high hazard and high vulnerability display the greatest flood risk, signalling where urgent action, planning, and protective measures are most needed. The formula used is straightforward:
\[ \text{Flood Risk} = \text{Hazard Index} \times \text{Vulnerability Index} \]
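With both indices computed on the same grid, the risk layer is a cell-by-cell product. A short sketch with toy index values (the classification breaks are hypothetical):

```python
import numpy as np

# Toy hazard and vulnerability index grids on the same raster.
hazard        = np.array([[1.6, 4.1],
                          [4.9, 1.7]])
vulnerability = np.array([[2.0, 3.5],
                          [4.2, 1.1]])

# Flood risk = hazard x vulnerability, element-wise.
risk = hazard * vulnerability

# Optionally bin the continuous index into low/moderate/high classes
# (hypothetical break values for illustration).
risk_class = np.digitize(risk, [5.0, 12.0])   # 0 = low, 1 = moderate, 2 = high
```

Cells that score high on both inputs dominate the product, which is why the combined map concentrates risk where hazard and vulnerability overlap.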

AHP workflow for generating flood risk map

Hazard, vulnerability and flood risk map
The sequence of visuals above summarizes how the hazard and vulnerability maps are combined into a single flood risk layer. The hazard map highlights environmental susceptibility, the vulnerability map indicates socio‑economic exposure, and their product yields a nuanced understanding of risk distribution.
Machine Learning Models
After constructing the flood risk maps, the team generated training and testing datasets using stratified random sampling. This approach ensured a representative sample of low, moderate, and high‑risk areas. Class imbalance was addressed, and outliers were removed to improve model performance. An automated machine learning library, MLjar, was employed to create an end‑to‑end machine learning pipeline. By leveraging AutoML, the team could systematically explore a range of algorithms and hyperparameters without manual tuning.
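The MLjar pipeline itself is not reproduced here, but the stratified random sampling idea can be sketched in plain NumPy. `stratified_split` is a hypothetical helper, not the team's code:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy sampled pixels: risk class per pixel (0 = low, 1 = moderate, 2 = high),
# deliberately imbalanced like real flood-risk data.
labels = np.repeat([0, 1, 2], [600, 300, 100])

def stratified_split(y, test_frac=0.3, rng=rng):
    """Split indices so train and test keep the original class proportions."""
    train_idx, test_idx = [], []
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))
        n_test = int(round(test_frac * idx.size))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

train_idx, test_idx = stratified_split(labels)
```

Sampling the same fraction from every class guarantees that the rare high-risk pixels are represented in both splits, which a plain random split would not.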
The following machine learning models were trained and compared:
- Linear regression model – a baseline algorithm that predicts flood risk as a continuous function of input variables;
- Decision tree – a tree‑based model that partitions the feature space into simple decision rules;
- Random forest – an ensemble of decision trees that improves robustness by averaging multiple models;
- XGBoost – an advanced boosting algorithm that iteratively enhances performance by focusing on residual errors;
- Neural network – a model composed of interconnected layers that can capture non‑linear relationships; and
- Ensemble model – a combination of several algorithms designed to leverage their complementary strengths.
Feature importance analysis provided insight into which factors most influenced model predictions, and a heatmap was generated to visualize these relationships.

Feature importance or heatmap for machine learning models
To validate each classification model, ROC curves (receiver operating characteristic curves) were plotted. An ROC curve shows how well a model distinguishes between classes across all possible thresholds. The curve plots the true positive rate against the false positive rate, illustrating the trade‑off between correctly identifying high‑risk areas and incorrectly flagging low‑risk locations.
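For intuition, the TPR/FPR pairs behind an ROC curve can be computed directly by sweeping a decision threshold. The labels and scores below are toy values, not the study's model outputs:

```python
import numpy as np

# Toy binary labels (1 = high-risk pixel) and model scores.
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])

def roc_points(y_true, y_score):
    """True/false positive rates at every distinct score threshold."""
    thresholds = np.sort(np.unique(y_score))[::-1]
    P, N = y_true.sum(), (1 - y_true).sum()
    tpr, fpr = [], []
    for t in thresholds:
        pred = y_score >= t
        tpr.append((pred & (y_true == 1)).sum() / P)
        fpr.append((pred & (y_true == 0)).sum() / N)
    return np.array(fpr), np.array(tpr)

fpr, tpr = roc_points(y_true, y_score)

# Area under the curve via the trapezoidal rule.
auc = np.sum(np.diff(fpr) * (tpr[:-1] + tpr[1:]) / 2)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is how the curves in the comparison figure are read.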

ROC curves comparisons of Machine Learning Models
Beyond the core variables used in AHP, additional factors such as altitude, aspect, curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), topographic roughness index (TRI), geology, and surface runoff were considered in exploratory analyses. The team also produced confusion matrices to evaluate model performance. These matrices revealed that the ensemble and XGBoost models performed best, with high accuracy across risk categories and relatively few misclassifications.

Normalized confusion matrix for flood risk classification

Confusion matrix for ensemble and XGBoost models

Performance times for machine learning models
The table above compares the computation times of several algorithms, illustrating that the ensemble model achieved strong performance while also being efficient. Taken together, these results suggest that machine learning models can classify high‑risk areas at least as accurately as the AHP approach, while potentially requiring fewer input features.
Comparison between AHP and Machine Learning Models
To evaluate the practical utility of the two approaches, the team created flood risk maps using both the AHP model and the best‑performing machine learning algorithms. The machine learning model proved adept at distinguishing high‑risk areas, often producing smoother and more precise maps than the AHP technique. Although there were relatively few samples in the moderate and high‑risk categories, the automated approach still performed comparably to the expert‑driven AHP. Moreover, the machine learning model achieved similar or better results using roughly half the number of input features, demonstrating its efficiency.

Flood risk map created using machine learning model

Flood risk map created using machine learning model

Flood risk map created using AHP model
The juxtaposition of these two maps reveals how data‑driven models can complement expert‑driven methods. While AHP provides a transparent framework grounded in expert judgment, machine learning can exploit patterns in the data to refine risk estimates and identify subtle variations in hazard and vulnerability.
Conclusion
The AHP analysis underscored that precipitation is the most influential factor in determining flood hazard in Togo. Areas with intense rainfall — often depicted in red on the hazard map — are highly prone to flooding, while regions with low precipitation appear in green and face minimal threat. On the vulnerability side, land use and land cover received the highest weight because deforestation, agricultural practices, and urban expansion strongly affect how exposed communities are to flooding. When combined, the hazard and vulnerability maps highlighted precipitation, land use/land cover, and population density as the dominant drivers of flood risk.
These findings reinforce the need for stringent land‑use planning, improved drainage and discharge management, and targeted interventions in high‑risk zones. The risk maps could be further improved by incorporating additional variables such as flow accumulation and lithology. More broadly, there is ample opportunity to apply AHP modelling to other regions and to integrate machine learning with expert knowledge. For instance, regional AHP models could provide initial risk scores, while machine learning could extend those scores to surrounding areas without having to build a full AHP model. By harnessing both human expertise and automated analytics, decision‑makers can develop more resilient strategies to mitigate flood impacts and protect vulnerable communities.
If you want to build advanced AI solutions for climate resilience, disaster prediction or environmental risk assessment, connect with Omdena. Share your goals and let us see if we are a good fit.



