How to Perform One-Way ANOVA in Python: A Step-by-Step Guide
This article explains the concept of one‑way ANOVA, walks through a real‑world example comparing four manufacturing processes, and demonstrates how to conduct the analysis in Python using statsmodels, interpreting the resulting F‑statistic and p‑value to assess significance.
One‑Way ANOVA
In a single‑factor experiment, only factor A varies while all other conditions are held constant. The goal is to determine whether the different levels of A produce a statistically significant difference in the response variable. Formally, this is a test of whether the means of several independent normal populations, assumed to share a common variance, are equal.
Example
Four manufacturing processes (A1–A4) are used to produce light bulbs. The lifetimes of bulbs from each process are measured, producing the data shown below.
A1: 1620, 1670, 1700, 1750, 1800
A2: 1580, 1600, 1640, 1720
A3: 1460, 1540, 1620, 1680
A4: 1500, 1550, 1610
The data constitute four independent samples; we test the null hypothesis that the four population means are equal.
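Before running any test, it can help to look at the sample sizes and sample means. The following is a minimal sketch, assuming NumPy is installed; the names `lifetimes`, `summary`, and `grand_mean` are illustrative and do not come from the original article.

```python
import numpy as np

# Bulb lifetimes for each process, copied from the data listed above.
lifetimes = {
    "A1": [1620, 1670, 1700, 1750, 1800],
    "A2": [1580, 1600, 1640, 1720],
    "A3": [1460, 1540, 1620, 1680],
    "A4": [1500, 1550, 1610],
}

# Sample size and sample mean for each process.
summary = {
    name: (len(values), float(np.mean(values)))
    for name, values in lifetimes.items()
}
for name, (n, mean) in summary.items():
    print(f"{name}: n={n}, mean={mean:.1f}")

# Grand mean over all 16 observations. Because the groups have unequal
# sizes, this is not the same as the plain average of the four group means.
grand_mean = float(np.mean(np.concatenate(list(lifetimes.values()))))
print(f"grand mean = {grand_mean:.1f}")
```

The group means differ visibly, but the samples are small, so a formal test is needed to judge whether the differences exceed what random variation alone would produce.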
The total sum of squares is decomposed into a between‑group component and a within‑group component, and the ANOVA F‑statistic is the ratio of the between‑group mean square to the within‑group mean square. If the resulting p‑value is below the chosen significance level (e.g., 0.05), the null hypothesis is rejected, indicating that factor A has a significant effect.
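The decomposition can be carried out by hand for the bulb data. The sketch below is one possible way to do it, assuming NumPy and SciPy are available; the variable names are illustrative, and `scipy.stats.f.sf` is used here only to obtain the upper‑tail probability of the F distribution.

```python
import numpy as np
from scipy import stats

# One array per manufacturing process.
groups = [
    np.array([1620, 1670, 1700, 1750, 1800]),  # A1
    np.array([1580, 1600, 1640, 1720]),        # A2
    np.array([1460, 1540, 1620, 1680]),        # A3
    np.array([1500, 1550, 1610]),              # A4
]
all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

# Between-group sum of squares: group size times the squared distance
# of each group mean from the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: squared deviations of each observation
# from its own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Total sum of squares, which should equal ss_between + ss_within.
ss_total = ((all_obs - grand_mean) ** 2).sum()

# Degrees of freedom: (number of groups - 1) and (observations - groups).
df_between = len(groups) - 1
df_within = len(all_obs) - len(groups)

# F is the ratio of the two mean squares; the p-value is the upper tail
# of the F distribution with (df_between, df_within) degrees of freedom.
f_stat = (ss_between / df_between) / (ss_within / df_within)
p_value = stats.f.sf(f_stat, df_between, df_within)

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
print(f"F = {f_stat:.4f}, p = {p_value:.4f}")
```

These quantities correspond to the `sum_sq`, `df`, `F`, and `PR(>F)` columns of the statsmodels ANOVA table reported later in the article, so the two can be compared directly.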
In practice the calculations are often performed with software. The following Python code uses statsmodels to carry out a one‑way ANOVA on the data.
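<code>import numpy as np
import statsmodels.api as sm

# Response: bulb lifetimes, listed process by process (A1, A2, A3, A4)
y = np.array([1620,1670,1700,1750,1800,1580,1600,1640,1720,1460,1540,1620,1680,1500,1550,1610])
# Factor: process label 1-4, repeated once per observation in that group
x = np.hstack([np.ones(5), np.full(4,2), np.full(4,3), np.full(3,4)])
d = {'x': x, 'y': y}
# C(x) tells the formula interface to treat x as categorical, not numeric
model = sm.formula.ols("y~C(x)", d).fit()
# One-way ANOVA table for the fitted model
anova_res = sm.stats.anova_lm(model)
print(anova_res)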
</code>
The output is:
          df   sum_sq    mean_sq   F         PR(>F)
C(x)      3    60153.3   20051.1   3.72774   0.0420037
Residual  12   64546.7   5378.89   NaN       NaN
Since the p‑value (0.042) is less than 0.05, we reject the null hypothesis and conclude that the manufacturing process has a statistically significant impact on bulb lifetime.
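As a cross‑check, the same test can be run without fitting a regression model. The sketch below assumes SciPy is installed and uses `scipy.stats.f_oneway`, which is a different route from the statsmodels approach shown above; it also computes the 5% critical value of the F distribution with (3, 12) degrees of freedom for comparison.

```python
from scipy import stats

# Bulb lifetimes per process, as listed in the example.
a1 = [1620, 1670, 1700, 1750, 1800]
a2 = [1580, 1600, 1640, 1720]
a3 = [1460, 1540, 1620, 1680]
a4 = [1500, 1550, 1610]

# f_oneway takes one sequence per group and returns (F statistic, p-value).
f_stat, p_value = stats.f_oneway(a1, a2, a3, a4)

# Critical value at the 5% level: the 95th percentile of F(3, 12).
f_crit = stats.f.ppf(0.95, 3, 12)

print(f"F = {f_stat:.4f}, p = {p_value:.4f}, critical value = {f_crit:.4f}")
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```

Comparing F against the critical value and comparing the p‑value against 0.05 are equivalent decision rules, so both should lead to the same conclusion as the ANOVA table.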
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".