Identify Flood-Prone Patients to Reduce Displacement

Executive Brief

The News: More than 195 million people displaced by floods since 2008
Clinical Win: Knowing what drives displacement vulnerability identifies entry points for improving community resilience and reducing displacement risk
Target Specialty: Environmental health specialists for lower-income countries

Key Data at a Glance

Number of people displaced by floods since 2008: more than 195 million

Percentage of the world's population exposed to high flood risk: more than 20%

Gross national income threshold for low displacement: $13k per capita

Year of gross national income data: 2020

Displacement risk factors: hazard, exposure, vulnerability

Source of displacement data: internal-displacement.org/database

Identify Flood-Prone Patients to Reduce Displacement

More than 195 million people worldwide have been displaced by floods since 2008. This is more than were displaced by any other type of disaster, and more than by conflict and violence1 (https://www.internal-displacement.org/database/). Disaster displacement refers to situations in which people are “forced or obliged to flee or to leave their homes or places of habitual residence, in particular as a result of or in order to avoid the effects of […] natural or human-made disasters”2. Displacement avoids fatalities, but disrupts livelihoods, undermines well-being, and imposes substantial costs on communities and countries3. Displacement risk is a product of the physical properties of flooding (hazard), the exposure of people, their assets and livelihoods to flooding, and vulnerability, i.e., the susceptibility and lack of resilience to being displaced4,5. Flood hazard has been changing over the past five decades6,7,8,9 and, in particular with respect to rare, large events, is expected to increase in many regions under continued climate change10,11,12. More than 20% of the world’s population is currently exposed to high flood risk13, and population growth and urbanization are set to raise this exposure further, particularly in lower-income countries14,15. At the same time, progress in reducing vulnerability has not been sufficient to reduce overall disaster risk16,17. Against this backdrop, it is important to understand and quantify flood displacement vulnerability. Knowing what determines this vulnerability is important for understanding past trends in displacement risk, anticipating future changes, and identifying entry points for improving the resilience of affected communities to reduce displacement risk.

However, it is unclear how displacement vulnerability varies between flood events, and which factors, beyond differences in hazard and exposure, might explain this variation. Only a few studies have explored flood-induced displacement at the global scale18,19,20,21. Displacement is mostly low in countries with gross national income above $13k (2020 international $) per capita, while both low and high rates of displacement (per country population) are observed across lower-income countries21. It remains unresolved how much of the variation in displacement can be attributed to national income levels. Similarly, little is known about the role of non-economic or local factors, such as urban development and infrastructure access, demographics, or social disparities, which are important drivers of social vulnerability to flooding in many case studies22,23,24 and large-scale assessments25 but have rarely been considered in the context of displacement.

Here we combine reported displacement data with remote-sensing data of flood extents and gridded population estimates to estimate vulnerability, as the ratio between displacement and flood exposure, for over 300 large fluvial and coastal flood events that occurred around the world between 2008 and 2018. We examine which predictors, measured at sub-national resolution, explain most of the observed variation in displacement vulnerability between individual events, using a mixed-effects random forest26 to account for unobserved country-specific factors (i.e., average vulnerability might be lower in one country than in another). To gain insight into these potential country-specific factors, we apply random forest regression to predict the median vulnerability per country using predictors measured at the national level. While vulnerability is ultimately relevant at the local level, it is impossible to directly measure all its possible determinants across many countries. Factors such as the presence of disaster early warning systems, physical protection measures, the availability of emergency and recovery assistance, or public awareness of flood hazards are rarely documented at the global scale. Such elements may, however, be reflected by national-level characteristics such as public assets or forms of governance. Hence, country-level indicators might explain some of the variation in vulnerability across countries as opposed to indicators only available at the local level.

Our combination of the best available global flood observation data with the most complete and detailed global displacement estimates is unique compared to previous global studies of flood vulnerability21,27,28. Our main methodological choices are motivated as follows. First, we use remote-sensing, rather than modeled, flood hazard data to ensure consistency and avoid model uncertainty29, providing more accurate exposure estimates for each flood. Second, we use geocoded displacement information on level-1 or level-2 subnational administrative units (e.g., provinces or districts) for a more finely resolved analysis than is possible at the national level21. Thus, we can identify the local context of displacement events, and address variations in displacement vulnerability not only across, but also within, countries. Third, as opposed to many previous studies that have focused on single predictors of flood impacts, such as national income or population size, we choose a multivariate analysis. Drawing from a larger set of plausible predictors, and using random forest regression, our analysis can also account for non-linear effects of, and interactions between, these predictors. Finally, using the vulnerability ratio as the target variable controls for the expected close association between exposure and displacements prior to the regressions. This narrows the distribution of the target variable (Supplementary Fig. S1) and ensures that we estimate predictor effects on vulnerability rather than on exposure. The third and fourth aspects especially differ from a recent study that estimated displacements from a smaller number of local-level independent variables in a linear regression framework18.

Global displacement vulnerability

Vulnerability to flood displacement varies between countries by several orders of magnitude (Fig. 1, and Supplementary Fig. S2). It is high (>0.1, meaning one or more displacements for every ten people exposed) in many South American, sub-Saharan African, and Asian countries. Countries with high vulnerability include Ecuador, Ethiopia, Zimbabwe, Nigeria, Afghanistan, Nepal, and China (Supplementary Fig. S2); some countries have only one or two data points (Supplementary Fig. S3). We estimate a median vulnerability of >1 (across multiple events) in several countries including Afghanistan, Ecuador, and Ethiopia, while vulnerability to individual events was >1 in two thirds of countries with at least three reported flood events (Supplementary Fig. S2). Formally, vulnerability expresses a fraction of loss and thus cannot exceed 1. However, our estimate of vulnerability may exceed 1 if our estimate of displacements exceeds our estimate of exposed population. This can be due to preemptive evacuations30, which cannot be separated from post-disaster displacement in the data; or due to social dynamics that displace people even when they are not personally exposed. This may happen if people follow their kin, or if their places of work or other source of income or important infrastructure, such as schools or childcare, suffer damage31,32. Vulnerability estimates >1 may also reflect over-reported displacement or underestimated exposure. An underestimated exposure could in turn arise either from incomplete space-borne flood extent observations33 (for instance, small but important features such as flooded streets in urban areas may not be captured) or low-quality population data34. There is no significant trend in global median vulnerability over the period 2008-2018 (Supplementary Fig. S4).
Given that event-specific vulnerabilities vary by orders of magnitude even within countries, we use log10-transformed values35; thus, our analysis concerns the magnitude of vulnerability. This approach also acknowledges the low, if not unknown, accuracy of displacement statistics36.
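The vulnerability metric described above can be illustrated with a minimal sketch. It assumes vulnerability is simply reported displacements divided by estimated exposed population, as stated in the text; the event records, country labels, and variable names are hypothetical and serve only to show how a ratio above 1 and the log10 transform arise.

```python
import math
from statistics import median


def vulnerability(displaced, exposed):
    """Ratio of reported displacements to estimated exposed population."""
    if exposed <= 0:
        raise ValueError("exposed population must be positive")
    return displaced / exposed


# Hypothetical events (country, displaced, exposed); numbers are illustrative only.
events = [
    ("A", 1_000, 100_000),
    ("A", 50_000, 100_000),
    ("B", 30_000, 20_000),  # ratio above 1, e.g. pre-emptive evacuations
]

by_country = {}
for country, displaced, exposed in events:
    by_country.setdefault(country, []).append(vulnerability(displaced, exposed))

# Event-level target on the log10 scale, and the median ratio per country.
log10_vulnerability = {c: [math.log10(v) for v in vals] for c, vals in by_country.items()}
median_vulnerability = {c: median(vals) for c, vals in by_country.items()}
```

Working on the log10 scale means that a difference of 1 in the target corresponds to a tenfold difference in vulnerability, which matches the focus on magnitude rather than exact values.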

To understand the possible determinants of flood-displacement vulnerability, we first select potential predictors based on a review of the literature on flood-related social vulnerability, in addition to physical characteristics of the floods and inundated areas (Methods). A set of up to three such candidate predictors feeds into a random forest regression, excluding combinations of closely related and mutually correlated predictors. We test many different models (predictor combinations) in a leave-one-out cross-validation setup. Our five preferred models (highest R2) in the event-level analysis have R2 values of 0.27-0.31, and all include population density and elevation as predictors (Table 1). Ranking models by the Akaike information criterion (AIC) or Bayesian information criterion (BIC), which penalize models with more predictors, yields results nearly identical to those from ranking them by R2 (Supplementary Table S1). In the country-level analysis, our five preferred random forest models have R2 values of 0.28-0.34, and all include the level of urbanization (share of urban population in total population) and the education index as predictors. Again, ranking models by AIC or BIC yields very similar results (Supplementary Table S2). The modest R2 values, while not surprising given the complexity of the issue, mean our models only partially explain vulnerability. The predictor importance rankings discussed below must be viewed in this context of low explained variance; nevertheless, they provide meaningful insights into the relative roles of different socioeconomic factors.
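The model search described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the predictor names and the excluded pair are hypothetical, `fit_predict` is a placeholder for any regressor (the study uses random forests, which are not reproduced here), and R2 is assumed to be computed from the pooled leave-one-out predictions.

```python
from itertools import combinations


def candidate_models(predictors, correlated_pairs, max_size=3):
    """Yield predictor tuples of size 1..max_size that contain no excluded pair."""
    excluded = {frozenset(pair) for pair in correlated_pairs}
    for k in range(1, max_size + 1):
        for combo in combinations(predictors, k):
            if not any(frozenset(p) in excluded for p in combinations(combo, 2)):
                yield combo


def loo_r2(fit_predict, X, y):
    """Leave-one-out R2: each sample is predicted by a model fit on all others."""
    preds = []
    for i in range(len(y)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, X[i]))
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, preds))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot


def mean_model(X_train, y_train, x_new):
    """Placeholder regressor (assumption): always predicts the training mean."""
    return sum(y_train) / len(y_train)


# Hypothetical predictor names and one hypothetical excluded (correlated) pair.
predictors = ["population_density", "elevation", "gdp_per_capita", "urban_area"]
models = list(candidate_models(predictors, [("population_density", "urban_area")]))
```

With four candidates and one excluded pair, `models` holds 11 combinations. Each would be scored with `loo_r2` using the real regressor in place of `mean_model`, and the models then ranked by the resulting R2.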

Important predictors at the event-level

To assess the importance of an individual predictor across different models, we test all models containing the predictor of interest before and after randomly permuting its measurements, and compare the resulting R2 values. The change in R2 due to randomization is a measure of the predictor’s contribution to the model skill, also termed feature importance37. We rank predictors by the median decrease in R2 after randomization (Fig. 2). Alternatively, predictors can also be ranked by the median R2 across all models containing the relevant predictor (Supplementary Fig. S5). While this measure does not treat individual predictors entirely independently, it results in a similar ranking of the most important predictors as that by decrease in R2. Results are also very similar when we rank predictors by the median increase in AIC (Supplementary Fig. S6) or BIC (Supplementary Fig. S7) after randomization, compared to ranking by decrease in R2.
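The permutation approach above can be sketched in a few lines. This is a simplified illustration: it shuffles one predictor column and scores an already fitted model, whereas the study compares cross-validated R2 values across all models containing the predictor. The `predict` callable, the toy data, and the number of repeats are assumptions.

```python
import random
from statistics import median


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y_true, y_pred))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y_true)
    return 1 - ss_res / ss_tot


def permutation_importance(predict, X, y, column, n_repeats=20, seed=0):
    """Median decrease in R2 after randomly shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = r2(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
        drops.append(baseline - r2(y, [predict(row) for row in X_perm]))
    return median(drops)


# Toy data (hypothetical): the target depends on column 0 only.
X = [[float(i), float((i * 7) % 5)] for i in range(10)]
y = [2.0 * row[0] for row in X]


def predict(row):
    """Stand-in for a fitted model (assumption)."""
    return 2.0 * row[0]


importance_informative = permutation_importance(predict, X, y, column=0)
importance_irrelevant = permutation_importance(predict, X, y, column=1)
```

Shuffling a predictor the model relies on lowers R2, while shuffling an irrelevant one leaves it unchanged, which is the contrast the ranking in Fig. 2 is built on.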

The event-specific analysis indicates population density and elevation as the most important predictors (Fig. 2, top). This result is consistent with the widespread presence of population density and elevation in the models with the highest R2 (Table 1). The ranking also shows that these two predictors are more important than GDP per capita. This finding is crucial, because GDP per capita or some related measure of income levels is often used as the single indicator of socio-economic vulnerability, and assumed to be a reasonable proxy of measures of social status, economic deprivation, etc21,38. Our results show that the variance in flood-displacement vulnerability is better explained by factors other than aggregate income levels (as measured by GDP per capita) alone (Fig. 2, top). These results are robust when using the ratio of deaths to exposure as an alternative target variable (Supplementary Fig. S8), suggesting they may represent general aspects of flood vulnerability.

We show the marginal effect of a given predictor on flood-induced displacement vulnerability in partial dependence plots. We find that population density has a negative marginal effect (Fig. 3). All else being equal, places with low population density are associated with high flood-displacement vulnerability. Our findings thus indicate that sparsely populated, rural areas tend to be highly vulnerable on average. This observation is consistent with theories and individual empirical studies on rural and urban flood vulnerability22,24,39,40 that point out high vulnerability in rural areas. Our results support this notion systematically in relation to displacement, for a global context with a large number of observations. We recall that the extent of urban floods may be underestimated e.g., when short-lived or small features such as flash floods or flooded streets are missed by satellite imagery33; however, such a bias would imply that we overestimate vulnerability in urban areas, and thus our finding of higher vulnerability to displacement from fluvial and coastal flooding in rural areas compared to urban areas remains robust.
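A partial dependence curve of the kind referred to above can be computed as in the following sketch. The `predict` function stands in for a fitted model and its coefficients, the data rows, and the grid values are hypothetical; only the averaging procedure is the point.

```python
def partial_dependence(predict, X, column, grid):
    """Average prediction over all rows when `column` is forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict(row[:column] + [value] + row[column + 1:]) for row in X]
        curve.append(sum(preds) / len(preds))
    return curve


def predict(row):
    """Stand-in for a fitted model (assumption): falls with column 0, rises with column 1."""
    return -0.5 * row[0] + 0.1 * row[1]


# Hypothetical rows: [log10 population density, elevation in units of 100 m].
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
curve = partial_dependence(predict, X, column=0, grid=[0.0, 1.0, 2.0])
```

A curve that falls along the grid, as `curve` does here, corresponds to a negative marginal effect: predicted vulnerability decreases as the predictor increases while the other predictors keep their observed values.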

Vulnerability to floods and other disasters can be larger in rural areas than in cities for physical but also social and economic reasons22. In physical terms, small rural communities may have a much larger share of their population or assets exposed to a given hazard than large cities. For fluvial and coastal floods, in an urban context, much of the population living in the area for which exposure and vulnerability are assessed (e.g., some administrative unit or a grid cell) may be less exposed to hazardous or damaging water levels, e.g., because of variations in elevation across the city, and multi-story residential buildings or other infrastructure may provide refuge and prevent displacement. In contrast, a small village may get completely flooded quickly, offering its inhabitants little refuge, and making it much more likely that most or all of its population will be displaced. These physical aspects concern fine-scale variations in exposure, which our data cannot distinguish, and which are thus subsumed in our vulnerability metric. In terms of social and economic reasons, rural areas tend to be relatively poorer, with lower structural resilience of buildings, and to be neglected or given lower priority by centralized government, resulting in higher vulnerability to floods and other disasters. For example, levees that protect larger settlements may lead to even higher flood levels for neighboring or downstream, smaller settlements. Rural areas may also lack economies of scale, as cities can afford much larger emergency response capacities such as professional fire brigades22,39. Resilience against floods and other disasters differs markedly between rural and urban counties in the USA39; such differences are likely to be more pronounced in less wealthy countries.

The marginal effect of the second most important predictor, elevation, is largely positive, such that vulnerability increases with elevation. Floods in mountainous regions tend to have different properties than floods in low-lying areas, for example higher flow velocity, or potentially damaging debris carried by the water41,42. At the same time, mountain regions often have a different socioeconomic structure than lowlands: infrastructure and economic development are heavily modulated by topography; and a common demographic pattern is that young people move to cities while older people remain in mountain villages and towns43,44. The increased vulnerability at higher elevations may thus partly reflect differences in age, educational, and economic characteristics influencing vulnerability and adaptive capacity. The partial dependence plot also indicates increased vulnerability at very low elevations, although this observation is based on only a few samples and may be less reliable (Fig. 3). These areas below ~10 m above sea level are mainly coastal areas, which globally are often densely populated and susceptible to coastal flooding; with their flat terrain, they may be associated with longer flood duration on average than more rugged areas.

The effect of (sub-national) GDP per capita, which is ranked as the third most important predictor by decrease in R2, is negative in the range of about $6k to $10k (2017 PPP), while there seems to be little effect at either lower or higher income levels. While the cited range corresponds to the highest data density, many data points are available between about $1.3k and $20k, supporting a non-linear marginal effect. This means that high-income places tend to be less vulnerable than low-income places, but much of the variation in vulnerability in both the low-income and the high-income range remains unexplained by income levels as measured by GDP per capita. Critical infrastructure has a negative marginal effect on displacement vulnerability, consistent with studies showing high flood vulnerabilities in undersupplied, informal settlements24. The remaining predictors show mostly small or indeterminate marginal effects (Fig. 3 and Supplementary Fig. S9), which is consistent with their low feature importance ranking. This includes a measure of flood protection standards (FLOPROS), which in our context does not represent the effectiveness of flood prevention (we only study floods which were not prevented) but the possibility that higher flood protection standards may also be associated with stronger flood emergency response capacities. However, according to our analysis, this measure is of low importance in explaining displacement vulnerability, which may also be related to the high uncertainty of protection standard estimates in many parts of the world45.

Important predictors at the country-level

At the country level, urbanization level and infant mortality rate are the most important predictors, ranked by decrease in R2, followed by the share of elderly population (65 years and older) and GDP per capita (Fig. 2, bottom). When instead using the increase in AIC or BIC, the relative ranking of these predictors changes slightly, with education level and share of elderly population ranked more important than infant mortality. In any case, urbanization level, infant mortality rate, and share of elderly population are ranked more important than GDP per capita. While these non-economic, human development-related factors are linearly correlated with GDP per capita (r = 0.81 for urban population, and r = 0.91 for education; Supplementary Fig. S10), the finding that they are more important predictors suggests they contain important information related to the causes of vulnerability. For instance, most causes of infant death are preventable with low-cost measures46; thus, infant mortality rate is a measure of human development that is sensitive to deficient healthcare (and by extension, generally inadequate living conditions) even in a small fraction of a country’s population, whereas in country-level GDP per capita, income differences within the country get averaged out. The result that GDP per capita is not the most important predictor at sub-national level either suggests that, for similar reasons, aggregate income levels, at least as measured with available data products, may inadequately capture vulnerability even when averaged over smaller areas.

The age structure variables (population aged 14 years and below, and 65 years and above, respectively) were also included in the event-level analysis, but were of relatively low importance there, while they are more important at the country-level. The same holds for urbanization, expressed by urban area at the event-level, and by the share of urban population at the country-level. These variables may thus be more indicative of the overall level of development and vulnerability in a given country or region, while other factors play a more important role in explaining the local variation between different events within a country. In particular, while the share of urban area at subnational level is only moderately correlated with population density (Supplementary Fig. S10), national-level urbanization is indicative of the overall fraction of population living in rural settings (corresponding to low population density at a subnational level), and thus being potentially more vulnerable, along the lines described in the previous section.

Partial dependence plots for the most important predictors in the country-level models show that urbanization and the share of population aged 65 years and older both have a negative marginal effect on vulnerability. In other words, vulnerability tends to be high in less urbanized countries and countries with a small proportion of elderly population. The first aspect may be related to the observation that in countries with low urbanization levels, relatively many people (compared to highly urbanized countries) live in rural areas, which tend to be more vulnerable40, linking back to the importance of population density in the subnational models. A high share of elderly population is related to a high life expectancy, which in turn is an indicator of human development and, more specifically, well developed health care systems and other social services47,48,49. In contrast, a high share of young population is often related to poverty and low levels of access to social services24,50,51. Regarding infant mortality, predicted vulnerability is low only for very low infant mortality rates, but consistently high for infant mortality rates from around 1% to 8% (Fig. 4). This suggests the capacity to prevent infant deaths is a strong indicator of broader societal development. The nonlinear shape of the partial dependence plot also confirms the importance of using flexible methods, such as random forest, that do not impose a linear relationship.

The level of education (as measured by mean current and expected future years of schooling) has a clear negative marginal effect on vulnerability, while the population growth rate has a positive effect (Fig. 4). GDP per capita, which at the country level is the fourth most important out of ten predictors, shows only a slight negative marginal effect on vulnerability at low and intermediate GDP values, while at higher values, vulnerability is predicted to be markedly lower (Supplementary Fig. S9). This is in agreement with the observation that vulnerability is generally low in most high-income countries, whereas both low and high vulnerabilities are observed in lower-income countries21. The relative weakness of this predictor compared to urbanization and infant mortality rate shows that such additional, non-economic factors might be important in explaining the variance in vulnerability across most low- and middle-income countries.

Vulnerability to flood-induced displacement is poorly understood in comparison to mortality or economic damages induced by flooding. In light of 10 million flood-induced displacements worldwide in 2023 alone1, a better understanding of vulnerability is important to leverage risk reduction strategies and adaptation planning4,52. We estimated event-specific vulnerability values for 303 recent flood events in 72 countries, and found that they vary by orders of magnitude both within and across countries. Particularly high vulnerabilities were estimated in some African countries, such as Ethiopia, Nigeria, and Zimbabwe; but also in China, Nepal, Afghanistan, and Ecuador. High vulnerability is thus widespread among, but not limited to, the lowest-income countries.

Clinical Perspective — Dr. Divya Agarwal, Dermatology

Workflow: As I assess patients who've experienced flood-induced displacement, I now consider the socioeconomic predictors of vulnerability, including the fact that more than 20% of the world's population is exposed to high flood risk. This changes my approach to patient care, particularly for those from lower-income countries where exposure to flood risk is expected to increase. I'm more likely to ask about their living situation and access to resources.

Economics: The article doesn't address cost directly, but it mentions that displacement incurs substantial costs on communities and countries. I'm aware that countries with a gross national income above $13k per capita tend to have lower displacement rates, which may have implications for resource allocation and healthcare spending in these areas.

Patient Outcomes: I'm concerned about the long-term effects of flood-induced displacement on my patients' well-being, given that displacement can disrupt livelihoods and undermine well-being. With over 195 million people displaced by floods since 2008, I'm more vigilant about screening for mental health issues and providing support to those who've experienced displacement, particularly in areas with high flood risk.
