Why Identical Statistics Can Hide Very Different Data: The Lesson of Anscombe’s Quartet

Anscombe’s Quartet shows that four data sets can share nearly identical means, variances, regression lines, and correlation coefficients yet produce completely different scatter plots. It illustrates why visualization is crucial and why relying only on summary statistics can mislead analysts.

Model Perspective

Sep 16, 2024

Anscombe’s Quartet

Anscombe’s Quartet, introduced by statistician Francis Anscombe in 1973, consists of four distinct 2‑dimensional data sets, each containing 11 (x, y) pairs. Their statistical summaries (means, variances, the fitted regression line y = 3 + 0.5x, and a correlation coefficient r ≈ 0.816, so r² ≈ 0.67) are virtually identical, yet their visual patterns differ dramatically.

Statistical characteristics shared by the four sets:

Mean: average x = 9, average y ≈ 7.5.

Variance: sample variance of x = 11, sample variance of y ≈ 4.12 to 4.13.

Linear regression: the same fitted line, y ≈ 3 + 0.5x, with correlation coefficient r ≈ 0.816 and r² ≈ 0.67.
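The shared statistics listed above can be checked directly. The following is a minimal sketch, assuming the standard published values of the quartet (the article itself does not list the data); the names QUARTET, summarize, and SUMMARIES are illustrative only.

```python
from statistics import mean, variance

# Standard published values of Anscombe's quartet (not listed in this article).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II and III
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the summary statistics quoted in the text for one data set."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)               # spread of x around its mean
    syy = sum((b - my) ** 2 for b in y)               # spread of y around its mean
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx                                 # least-squares slope
    r = sxy / (sxx * syy) ** 0.5                      # Pearson correlation
    return {
        "mean_x": mx, "mean_y": my,
        "var_x": variance(x), "var_y": variance(y),   # sample variances (n - 1)
        "slope": slope, "intercept": my - slope * mx,
        "r": r, "r2": r * r,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

for name, s in SUMMARIES.items():
    print(name, {k: round(v, 2) for k, v in s.items()})
```

Rounded to two decimal places, the four printed rows should be hard to tell apart, which is exactly the point: nothing in these numbers hints at how differently the four sets are shaped.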

Plotting Reveals the Truth

Scatter‑plot visualizations show distinct shapes:

Dataset 1: points scatter around a straight line, the ordinary linear relationship that a regression model assumes.

Dataset 2: despite the same regression result, points follow a clear curved pattern, exposing a non‑linear relationship.

Dataset 3: most points align on a line but one obvious outlier heavily influences the regression.

Dataset 4: all but one of the points share the same x value (x = 8), so there is no horizontal variation among them; the regression line is misleading because a single point at x = 19 determines the slope and forces the same equation as the other sets.
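The four shapes described above can be reproduced even without a plotting library. The sketch below draws crude text-mode scatter plots using only the Python standard library; in practice a charting library would normally be used instead. It assumes the standard published values of the quartet (not listed in this article), and the function name text_scatter and the axis ranges are illustrative choices, picked so that all four panels share the same axes.

```python
# Standard published values of Anscombe's quartet (not listed in this article).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II and III
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def text_scatter(x, y, width=40, height=12, x_range=(3, 20), y_range=(2, 14)):
    """Render (x, y) points as a character grid with fixed, shared axis ranges."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    grid = [[" "] * width for _ in range(height)]
    for a, b in zip(x, y):
        col = round((a - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y_hi - b) / (y_hi - y_lo) * (height - 1))  # row 0 is the top
        grid[row][col] = "*"  # points that land in the same cell overlap
    body = "\n".join("|" + "".join(cells) for cells in grid)
    return body + "\n+" + "-" * width


PANELS = {name: text_scatter(x, y) for name, (x, y) in QUARTET.items()}

for name, panel in PANELS.items():
    print(f"Dataset {name}")
    print(panel)
    print()
```

Keeping the axis ranges identical across panels is a deliberate choice: it makes the curve in Dataset 2, the stray point in Dataset 3, and the vertical stack in Dataset 4 directly comparable with Dataset 1.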

Takeaway

The quartet demonstrates that relying solely on summary statistics such as means, variances, or correlation coefficients can be deceptive. Visual inspection is essential to uncover underlying patterns, outliers, or non‑linear relationships that numbers alone may hide. In practice, always complement statistical analysis with appropriate visualizations.
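One way to put a number on what the plots reveal is to refit the line after removing the single unusual point in Datasets 3 and 4. This is only an illustrative check under stated assumptions, not a substitute for looking at the data: it uses the standard published values of the quartet (not listed in this article), and the helper names fit_line and drop are hypothetical.

```python
# Standard published values for sets III and IV (not listed in this article).
X3 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
Y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]


def fit_line(x, y):
    """Least-squares (slope, intercept), or None when x has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return None  # a vertical stack of points has no defined slope
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def drop(x, y, index):
    """Return copies of x and y with the point at `index` removed."""
    return x[:index] + x[index + 1:], y[:index] + y[index + 1:]


full_3 = fit_line(X3, Y3)
trimmed_3 = fit_line(*drop(X3, Y3, 2))   # index 2 is the outlier (13, 12.74)
full_4 = fit_line(X4, Y4)
trimmed_4 = fit_line(*drop(X4, Y4, 7))   # index 7 is the lone point at x = 19

print("Dataset III, all points:     ", full_3)
print("Dataset III, outlier removed:", trimmed_3)
print("Dataset IV, all points:      ", full_4)
print("Dataset IV, x = 19 removed:  ", trimmed_4)
```

With the outlier removed, the slope for Dataset 3 drops noticeably below 0.5, and Dataset 4 has no fitted line at all because every remaining point has x = 8. Both results show how much a single point can drive the shared summary statistics.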

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
