Why Hospital Survival Rates Can Mislead: Unveiling Simpson’s Paradox
Simpson’s Paradox shows how aggregated data can suggest one trend while every subgroup shows the opposite. This article illustrates it with hospital survival rates: overall, Hospital A appears better than Hospital B, yet a breakdown by severity of illness flips the conclusion, which highlights the need to consider background variables when interpreting statistics.
Simpson's Paradox is a statistical phenomenon in which a trend that holds within every subgroup reverses when the subgroups are combined.
Hospital Survival Rates
Overall survival data for the two hospitals are:
Hospital A: 900 survivors, 100 deaths, survival rate 90%.
Hospital B: 800 survivors, 200 deaths, survival rate 80%.
At first glance, Hospital A appears to perform better with a higher overall survival rate.
However, when the data are broken down by severity of illness, the picture changes.
Hospital A – Severe cases: 100 admissions, 30 survivors, 70 deaths (30% survival).
Hospital A – Non‑severe cases: 900 admissions, 870 survivors, 30 deaths (96.7% survival).
Hospital B – Severe cases: 400 admissions, 210 survivors, 190 deaths (52.5% survival).
Hospital B – Non‑severe cases: 600 admissions, 590 survivors, 10 deaths (98.3% survival).
From these subgroup figures, Hospital B has a higher survival rate than Hospital A for both severe and non‑severe patients.
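The comparison above can be reproduced with a few lines of Python. This is a minimal sketch: the counts are copied from the figures in this article, while the variable and function names (`counts`, `survival_rate`, `overall_rate`) are illustrative choices, not part of any library.

```python
# (survivors, admissions) per hospital and severity group, taken from the article.
counts = {
    "A": {"severe": (30, 100), "non_severe": (870, 900)},
    "B": {"severe": (210, 400), "non_severe": (590, 600)},
}

def survival_rate(survivors, admissions):
    """Fraction of admitted patients who survived."""
    return survivors / admissions

def overall_rate(groups):
    """Pool all severity groups of one hospital, then compute the rate."""
    survivors = sum(s for s, _ in groups.values())
    admissions = sum(n for _, n in groups.values())
    return survivors / admissions

subgroup_rates = {
    hospital: {group: survival_rate(*c) for group, c in groups.items()}
    for hospital, groups in counts.items()
}
overall_rates = {hospital: overall_rate(groups) for hospital, groups in counts.items()}

for hospital in counts:
    print(f"Hospital {hospital}: overall {overall_rates[hospital]:.1%}")
    for group, rate in subgroup_rates[hospital].items():
        print(f"  {group}: {rate:.1%}")

# B wins in every severity group, yet A wins after pooling.
b_wins_every_group = all(
    subgroup_rates["B"][g] > subgroup_rates["A"][g] for g in counts["A"]
)
a_wins_overall = overall_rates["A"] > overall_rates["B"]
print("Reversal present:", b_wins_every_group and a_wins_overall)
```

Keeping raw counts rather than precomputed percentages matters here: the pooled rate has to be computed from summed survivors and admissions, not by averaging the subgroup percentages.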
Why Does This Happen?
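Simpson's Paradox is a counter‑intuitive statistical effect: once a background variable is taken into account, the same data can lead to the opposite conclusion. Here the background variable is severity of illness. Hospital B admitted 400 severe cases out of 1,000, while Hospital A admitted only 100, and severe cases have much lower survival at both hospitals, so B's overall rate is pulled down by its harder case mix.
The reversal does not occur in every dataset, but it demonstrates that trends observed within individual groups cannot be directly extrapolated to the overall population, and vice versa.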
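One way to see the mechanism is to write each hospital's overall survival rate as a weighted average of its subgroup rates, where the weights are the shares of severe and non‑severe admissions. The symbols $r_{h,g}$ (survival rate of hospital $h$ in group $g$) and $w_{h,g}$ (share of hospital $h$'s admissions in group $g$) are notation introduced here for illustration; the numbers come from the figures above.

```latex
\[
R_h = \sum_{g} w_{h,g}\, r_{h,g},
\qquad
w_{h,g} = \frac{n_{h,g}}{\sum_{g'} n_{h,g'}}
\]
\[
R_A = \frac{100}{1000}\cdot\frac{30}{100} + \frac{900}{1000}\cdot\frac{870}{900}
    = 0.03 + 0.87 = 0.90
\]
\[
R_B = \frac{400}{1000}\cdot\frac{210}{400} + \frac{600}{1000}\cdot\frac{590}{600}
    = 0.21 + 0.59 = 0.80
\]
```

Hospital B has the higher rate in each term's second factor, but it puts a weight of 0.40 on the low‑survival severe group, against 0.10 for Hospital A, and that difference in weights is enough to reverse the overall ordering.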
The arithmetic behind Simpson's Paradox can be summarized in two situations:
Each group's ratio points in the same direction, but the overall ratio reverses.
If one option has the higher ratio in every group, yet has the lower ratio once the groups are combined, the paradox occurs. The hospital example above is of this kind.
The groups' ratios point in opposite directions, and the overall ratio follows one of them.
If one group favors one option and another group favors the other, the combined ratio can fall on either side; which side it lands on depends on how heavily each group is weighted in the total.
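The first situation can be stated as a pair of inequalities. In the sketch below, $s_{h,g}$ and $n_{h,g}$ are notation assumed for this illustration, standing for the survivors and admissions of hospital $h$ in group $g$.

```latex
\[
\frac{s_{A,g}}{n_{A,g}} < \frac{s_{B,g}}{n_{B,g}} \quad \text{for every group } g,
\qquad \text{yet} \qquad
\frac{\sum_{g} s_{A,g}}{\sum_{g} n_{A,g}} > \frac{\sum_{g} s_{B,g}}{\sum_{g} n_{B,g}}
\]
\[
\frac{30}{100} < \frac{210}{400},
\qquad
\frac{870}{900} < \frac{590}{600},
\qquad \text{but} \qquad
\frac{900}{1000} > \frac{800}{1000}
\]
```

Both sets of inequalities can hold at once only because the group sizes $n_{A,g}$ and $n_{B,g}$ differ between the hospitals; if both hospitals had the same proportion of severe cases, the pooled comparison would have to agree with the subgroup comparisons.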
Simpson's Paradox emphasizes the importance of considering background variables in statistical analysis. It reminds us that aggregate ratios and trends can be misleading; a sounder approach breaks the data down along the plausible confounding factors before drawing conclusions.
When dealing with complex data, we should be vigilant about the potential impact of Simpson's Paradox, paying attention not only to overall trends but also to the relationships within each subset.
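One common way to compare groups while accounting for a background variable is direct standardization: apply each hospital's subgroup rates to the same reference case mix. The sketch below is an illustration rather than a method prescribed by this article; it assumes the pooled admissions of both hospitals (500 severe, 1,500 non‑severe) as the reference mix, and the names `reference_mix` and `standardized_rate` are hypothetical.

```python
# Subgroup survival rates from the article's figures.
rates = {
    "A": {"severe": 30 / 100, "non_severe": 870 / 900},
    "B": {"severe": 210 / 400, "non_severe": 590 / 600},
}

# Assumption: use the pooled admissions of both hospitals as the common case mix.
reference_mix = {"severe": 100 + 400, "non_severe": 900 + 600}

def standardized_rate(stratum_rates, mix):
    """Survival rate the hospital would have if it treated the reference case mix."""
    total = sum(mix.values())
    expected_survivors = sum(stratum_rates[group] * n for group, n in mix.items())
    return expected_survivors / total

adjusted = {
    hospital: standardized_rate(stratum_rates, reference_mix)
    for hospital, stratum_rates in rates.items()
}

for hospital, rate in adjusted.items():
    print(f"Hospital {hospital}: severity-adjusted survival {rate:.1%}")
```

With this reference mix the adjusted rates come out at about 80.0% for Hospital A and 86.9% for Hospital B, which agrees with the subgroup comparison. A different reference mix would change the exact figures, but not the ordering, since Hospital B has the higher rate in both severity groups.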
Author: Wang Haihua
References: Blyth, C. R. (1972). "On Simpson's Paradox and the Sure-Thing Principle". Journal of the American Statistical Association. Simpson, E. H. (1951). "The Interpretation of Interaction in Contingency Tables". Journal of the Royal Statistical Society. Xu, S. (2021). 大话机器智能: 一书看透AI的底层运行逻辑. Mechanical Industry Press.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".