Comparing Distributions Between Groups: Visualization and Statistical Methods in Python

This article demonstrates how to compare the distribution of a variable across treatment and control groups using Python, covering data simulation, visual techniques such as boxplots, histograms, KDE, CDF, QQ and ridgeline plots, and statistical tests including t‑test, SMD, Mann‑Whitney, permutation, chi‑square, KS and ANOVA.

Python Programming Learning Circle

Feb 24, 2023

Comparing the distribution of a variable between different groups is a common problem in data science, especially when evaluating the causal effect of a strategy (e.g., a user‑experience feature, an advertising campaign, a drug) via randomized controlled trials (A/B tests). Randomization ensures that the only systematic difference between the control and treatment groups is the experimental intervention.

Data Simulation

We simulate a dataset of 1,000 individuals with gender, age, and weekly income, randomly assigning each to either a treatment arm (four variations) or a control arm.

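import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from src.utils import *  # helper module from the original author's repository
from src.dgp import dgp_rnd_assignment

df = dgp_rnd_assignment().generate_data()
print(df.head())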
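The src.dgp module belongs to the original author's repository and is not shown here. For readers without it, the following is a minimal stand-in that produces a table with the same column names used later (Group, Arm, Gender, Age, Income). The function name simulate_experiment, the distributions, and the size of the treatment effect are all assumptions made for illustration, not the author's data-generating process.

```python
import numpy as np
import pandas as pd


def simulate_experiment(n=1000, seed=0):
    """Hypothetical substitute for dgp_rnd_assignment().generate_data()."""
    rng = np.random.default_rng(seed)
    # Random assignment to control or treatment with equal probability (assumed).
    group = rng.choice(["control", "treatment"], size=n)
    # Treated individuals receive one of four arms; control rows have no arm.
    arm = pd.Series(
        rng.choice(["arm 1", "arm 2", "arm 3", "arm 4"], size=n)
    ).where(group == "treatment")
    # Right-skewed weekly income with a small, made-up treatment shift.
    income = rng.lognormal(6.0, 0.5, size=n) + 20 * (group == "treatment")
    return pd.DataFrame({
        "Group": group,
        "Arm": arm,
        "Gender": rng.choice(["male", "female"], size=n),
        "Age": rng.integers(18, 65, size=n),
        "Income": income,
    })


df_demo = simulate_experiment()
print(df_demo.head())
```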

Visualization Methods

Boxplot : Shows median, Q1, Q3, and outliers, giving a compact summary of the income distribution for each group.

sns.boxplot(data=df, x='Group', y='Income')
plt.title("Boxplot")

Histogram : Bins the income values; using stat='density' and common_norm=False makes the histograms comparable despite different sample sizes.

sns.histplot(data=df, x='Income', hue='Group', bins=50, stat='density', common_norm=False)
plt.title("Density Histogram")

Kernel Density Estimate (KDE) : Provides a smooth estimate of the income distribution.

sns.kdeplot(x='Income', data=df, hue='Group', common_norm=False)
plt.title("Kernel Density Function")

Cumulative Distribution Function (CDF) : Plots the cumulative proportion of observations up to each income value, avoiding arbitrary bin choices.

sns.histplot(x='Income', data=df, hue='Group', bins=len(df), stat='density', element='step', fill=False, cumulative=True, common_norm=False)
plt.title("Cumulative distribution function")
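Because the empirical CDF needs no bins at all, it can also be computed directly from the sorted values. The helper below is a small sketch (the name ecdf is ours); seaborn's ecdfplot offers a ready-made plot of the same quantity.

```python
import numpy as np


def ecdf(values):
    """Return sorted values and the share of observations <= each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


x, y = ecdf([3.0, 1.0, 2.0, 4.0])
print(x, y)
```

Plotting x against y as a step function for each group gives the same picture as the cumulative histogram, without choosing a bin count.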

QQ Plot : Compares quantiles of the two groups; deviation from the 45° line indicates differences in distribution shape.

income = df['Income'].values
income_t = df.loc[df.Group=='treatment', 'Income'].values
income_c = df.loc[df.Group=='control', 'Income'].values

df_pct = pd.DataFrame()
df_pct['q_treatment'] = np.percentile(income_t, range(100))
df_pct['q_control'] = np.percentile(income_c, range(100))

plt.figure(figsize=(8,8))
plt.scatter(x='q_control', y='q_treatment', data=df_pct, label='Actual fit')
# 45-degree reference line drawn over the observed range of control quantiles
plt.plot(df_pct['q_control'], df_pct['q_control'], color='r', label='Line of perfect fit')
plt.xlabel('Quantile of income, control group')
plt.ylabel('Quantile of income, treatment group')
plt.title('QQ plot')
plt.legend()

Statistical Tests

Student t‑test : Tests equality of means.

from scipy.stats import ttest_ind
stat, p_value = ttest_ind(income_c, income_t)
print(f"t-test: statistic={stat:.4f}, p-value={p_value:.4f}")
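To make explicit what the statistic measures, the sketch below recomputes the pooled-variance (Student) t statistic by hand; it should agree with ttest_ind under its default equal_var=True. The function name pooled_t_test and the sample data are illustrative assumptions.

```python
import numpy as np
from scipy import stats


def pooled_t_test(a, b):
    """Two-sided Student t-test assuming equal variances in both groups."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    t = (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))
    p = 2 * stats.t.sf(abs(t), df=na + nb - 2)
    return t, p


rng = np.random.default_rng(1)
sample_a = rng.normal(0.0, 1.0, size=40)
sample_b = rng.normal(0.5, 1.0, size=60)
t_stat, p_val = pooled_t_test(sample_a, sample_b)
print(f"t={t_stat:.4f}, p={p_val:.4f}")
```

When the group variances may differ, passing equal_var=False to ttest_ind switches to Welch's version of the test.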

Standardized Mean Difference (SMD) : Provides a scale‑free measure of covariate imbalance.

from causalml.match import create_table_one
df['treatment'] = df['Group'] == 'treatment'
create_table_one(df, 'treatment', ['Gender', 'Age', 'Income'])
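For a single variable, the SMD can also be computed by hand. The sketch below uses one common definition, the difference in means divided by the square root of the average of the two group variances; whether causalml applies exactly this formula is an assumption here, and the function name smd is ours.

```python
import numpy as np


def smd(treated, control):
    """Standardized mean difference with an averaged-variance denominator."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd


# Means 4 and 3, both sample variances 4, so the SMD is 1 / 2.
value = smd([2, 4, 6], [1, 3, 5])
print(value)
```

A frequently cited rule of thumb treats absolute values below 0.1 as negligible imbalance.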

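Mann‑Whitney U test : Non‑parametric, rank‑based test of whether values in one group tend to be larger than in the other; it reduces to a comparison of medians only when the two distributions have the same shape.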

from scipy.stats import mannwhitneyu
stat, p_value = mannwhitneyu(income_t, income_c)
print(f"Mann–Whitney U Test: statistic={stat:.4f}, p-value={p_value:.4f}")
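The U statistic itself is simple to derive from ranks: pool both samples, rank them, and subtract the smallest possible rank sum from the rank sum of the first sample. The sketch below (function name u_statistic is ours) illustrates this; recent SciPy versions report the U of the first sample, which is the quantity computed here.

```python
import numpy as np
from scipy.stats import rankdata


def u_statistic(a, b):
    """U statistic of sample a: number of (a, b) pairs where a beats b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    ranks = rankdata(np.concatenate([a, b]))  # ties receive average ranks
    rank_sum_a = ranks[: len(a)].sum()
    return rank_sum_a - len(a) * (len(a) + 1) / 2


print(u_statistic([1, 2, 3], [4, 5, 6]))  # a never exceeds b
print(u_statistic([4, 5, 6], [1, 2, 3]))  # a exceeds b in all 9 pairs
```

Dividing U by the product of the two sample sizes estimates the probability that a random draw from the first group exceeds one from the second, counting ties as one half.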

Permutation test : Randomly shuffles group labels to build a null distribution of the mean difference.

sample_stat = np.mean(income_t) - np.mean(income_c)
is_treated = (df['Group'] == 'treatment').values
stats = np.zeros(1000)
for k in range(1000):
    # Shuffle the treatment labels while keeping incomes fixed
    labels = np.random.permutation(is_treated)
    stats[k] = np.mean(income[labels]) - np.mean(income[~labels])
# One-sided p-value: share of permuted differences above the observed one
p_value = np.mean(stats > sample_stat)
print(f"Permutation test: p-value={p_value:.4f}")
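A reusable, seeded variant with a two-sided p-value can be sketched as follows. The function name permutation_test, the default of 1,000 permutations, and the add-one correction (which keeps the p-value from being exactly zero) are choices made for this illustration rather than part of the original code.

```python
import numpy as np


def permutation_test(a, b, n_perm=1000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])
    observed = a.mean() - b.mean()
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Reassign observations to the two groups at random.
        shuffled = rng.permutation(pooled)
        null[k] = shuffled[: len(a)].mean() - shuffled[len(a):].mean()
    # Share of permuted differences at least as extreme as the observed one.
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value


rng = np.random.default_rng(2)
far_a = rng.normal(5.0, 1.0, size=50)
far_b = rng.normal(0.0, 1.0, size=50)
diff, p_far = permutation_test(far_a, far_b)
print(f"difference={diff:.3f}, p-value={p_far:.4f}")
```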

Chi‑square test : Compares observed and expected frequencies across income bins derived from the control group.

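from scipy.stats import chisquare
# df_bins must hold, per income bin, the observed treatment counts and the
# counts expected if the treatment group followed the control distribution
stat, p_value = chisquare(df_bins['income_t_observed'], df_bins['income_t_expected'])
print(f"Chi-squared Test: statistic={stat:.4f}, p-value={p_value:.4f}")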
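The snippet above uses a df_bins table whose construction is not shown in this article. One plausible way to obtain the observed and expected counts, sketched here with plain NumPy arrays and simulated incomes, is to take decile edges from the control group and count both groups within them. The choice of ten bins and the simulated data are assumptions.

```python
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)
income_c = rng.lognormal(6.0, 0.5, size=500)  # simulated control incomes
income_t = rng.lognormal(6.1, 0.5, size=500)  # simulated treatment incomes

# Bin edges at the control-group deciles, so control counts are about equal.
edges = np.quantile(income_c, np.linspace(0, 1, 11))
c_observed, _ = np.histogram(income_c, bins=edges)
# Treatment values outside the control range fall outside all bins.
t_observed, _ = np.histogram(income_t, bins=edges)

# Expected treatment counts if treatment followed the control distribution.
t_expected = c_observed / c_observed.sum() * t_observed.sum()

stat, p_value = chisquare(t_observed, t_expected)
print(f"Chi-squared Test: statistic={stat:.4f}, p-value={p_value:.4f}")
```

Note that the result depends on the number of bins, which is one reason to complement this test with bin-free alternatives.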

Kolmogorov‑Smirnov test : Measures the maximum absolute difference between the two CDFs.

from scipy.stats import kstest
stat, p_value = kstest(income_t, income_c)
print(f"Kolmogorov‑Smirnov Test: statistic={stat:.4f}, p-value={p_value:.4f}")
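The KS statistic can be reproduced from the two empirical CDFs: evaluate both at every observed value and take the largest absolute gap. The sketch below (function name ks_statistic is ours) computes only the statistic, not the p-value.

```python
import numpy as np


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])  # the gap can only change at observed values
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))


# Fully separated samples: one CDF reaches 1 before the other leaves 0.
print(ks_statistic([1.0, 2.0], [3.0, 4.0]))
```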

ANOVA (F‑test) : Extends the comparison of means to more than two groups by comparing the variance between group means with the variance within groups.

from scipy.stats import f_oneway
income_groups = [df.loc[df['Arm']==arm, 'Income'].values for arm in df['Arm'].dropna().unique()]
stat, p_value = f_oneway(*income_groups)
print(f"F Test: statistic={stat:.4f}, p-value={p_value:.4f}")
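The F statistic is the ratio of the between-group mean square to the within-group mean square. A hand computation, sketched below under the name f_statistic (ours), makes that ratio explicit.

```python
import numpy as np


def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# SS_between = 6 on 2 df, SS_within = 6 on 6 df, so F = 3.
f_value = f_statistic([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_value)
```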

Multi‑Group Visualisations

Boxplots, violin plots, and ridgeline plots (via joypy) can display income distributions for several experimental arms simultaneously, revealing trends such as increasing average income with higher arm numbers.
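As a rough illustration of the multi-group case, the sketch below draws a boxplot and a violin plot side by side with plain matplotlib rather than seaborn or joypy. The arm labels and the simulated incomes, including the upward drift across arms, are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
arms = ["control", "arm 1", "arm 2", "arm 3", "arm 4"]
# Simulated incomes whose location rises slightly with each arm (assumed).
incomes = [rng.lognormal(6.0 + 0.05 * i, 0.5, size=200) for i in range(len(arms))]

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
ax_box.boxplot(incomes)
ax_box.set_title("Boxplot by arm")
ax_violin.violinplot(incomes, showmedians=True)
ax_violin.set_title("Violin plot by arm")
for ax in (ax_box, ax_violin):
    # Both plot types place the groups at positions 1..len(arms).
    ax.set_xticks(range(1, len(arms) + 1))
    ax.set_xticklabels(arms)
ax_box.set_ylabel("Income")
fig.savefig("multi_group.png")
```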

Conclusion

The article presents a suite of visual and statistical tools for comparing one variable across two or more groups. Visualisations give intuitive insight, effect-size measures such as the SMD quantify the magnitude of a difference, and formal hypothesis tests assess whether it is statistically significant; all three support reliable causal inference.
