Five Quick and Easy Data Visualization Techniques in Python with Matplotlib
This article introduces five essential Matplotlib chart types (scatter plot, line plot, histogram, bar plot, and box plot), explains their strengths and typical use cases, and provides a reusable Python function for each.
Data visualization is a crucial part of a data scientist's workflow, especially during exploratory analysis of large or high-dimensional datasets. Each section below describes when a chart type is useful and then wraps the corresponding Matplotlib calls in a function for rapid plotting.
Scatter Plot
Scatter plots reveal the relationship between two variables and can encode a third variable via point size or color grouping. The scatterplot function in this section wraps ax.scatter and exposes the point color, the axis labels and title, and an optional logarithmic y-axis.
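The scatterplot function in this section plots two variables only. As a minimal sketch of the third-variable encoding mentioned above, the example below maps a group label to color and a numeric magnitude to point size. The helpers scale_sizes and grouped_scatter, the group names, and the data are all illustrative assumptions, not part of the original article.

```python
import matplotlib.pyplot as plt
import numpy as np

def scale_sizes(values, min_size=10.0, max_size=200.0):
    """Linearly rescale values to marker areas between min_size and max_size."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: fall back to a single mid-range size
        return np.full(values.shape, (min_size + max_size) / 2)
    return min_size + (values - values.min()) / span * (max_size - min_size)

def grouped_scatter(x, y, sizes, groups, group_colors):
    """Draw one ax.scatter call per group so each group gets its own color and legend entry."""
    x, y, sizes, groups = map(np.asarray, (x, y, sizes, groups))
    fig, ax = plt.subplots()
    for name, color in group_colors.items():
        mask = groups == name
        ax.scatter(x[mask], y[mask], s=sizes[mask], color=color, alpha=0.75, label=name)
    ax.legend(loc='best')
    return fig, ax

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 3.5, 3.0, 5.5, 6.0, 7.5]
magnitude = [5, 10, 15, 20, 25, 30]
groups = ['a', 'a', 'a', 'b', 'b', 'b']

sizes = scale_sizes(magnitude)
fig, ax = grouped_scatter(x, y, sizes, groups, {'a': '#539caf', 'b': '#7663b0'})
```

Calling plt.show() afterwards displays the figure. Drawing each group with its own scatter call keeps the legend simple, at the cost of one call per group.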
<code>import matplotlib.pyplot as plt
import numpy as np

def scatterplot(x_data, y_data, x_label="", y_label="", title="", color = "r", yscale_log=False):
    # Create the plot object
    _, ax = plt.subplots()
    # Plot the data, set the size (s), color and transparency (alpha) of the points
    ax.scatter(x_data, y_data, s = 10, color = color, alpha = 0.75)
    # Optionally switch the y-axis to a logarithmic scale
    if yscale_log:
        ax.set_yscale('log')
    # Label the axes and provide a title
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
</code>
Line Plot
Line plots work best when one variable clearly varies with another over a range (that is, the two have high covariance); a single line then summarizes the trend more clearly than a dense cloud of scatter points. The lineplot function in this section mirrors the scatterplot structure, replacing ax.scatter with ax.plot.
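One practical detail when drawing a line: ax.plot connects points in the order they are given, so unsorted x values produce a zigzag rather than a trend. The sketch below sorts both arrays by x before plotting. The sort_by_x helper and the sample values are illustrative assumptions.

```python
import numpy as np

def sort_by_x(x_data, y_data):
    """Return copies of x_data and y_data reordered so that x is ascending."""
    x = np.asarray(x_data)
    y = np.asarray(y_data)
    # A stable sort keeps the original order of points that share the same x
    order = np.argsort(x, kind='stable')
    return x[order], y[order]

# Made-up, deliberately unsorted sample
x_sorted, y_sorted = sort_by_x([3, 1, 2], [30, 10, 20])
```

The sorted arrays can then be passed to lineplot as x_data and y_data.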
<code>def lineplot(x_data, y_data, x_label="", y_label="", title=""):
    # Create the plot object
    _, ax = plt.subplots()
    # Plot the data as a line, set the linewidth (lw), color and transparency (alpha) of the line
    ax.plot(x_data, y_data, lw = 2, color = '#539caf', alpha = 1)
    # Label the axes and provide a title
    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
</code>
Histogram
Histograms display the distribution of a single variable, showing frequency across bins. The function allows control over the number of bins and whether the histogram is cumulative.
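To make the two options concrete, the sketch below uses np.histogram to compute the per-bin counts that a histogram draws, then accumulates them with np.cumsum to show what the bars of a cumulative histogram represent. The sample values are made up for illustration.

```python
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]   # made-up sample

# Four equal-width bins spanning the data range [1, 9]
counts, edges = np.histogram(data, bins=4)
# edges  -> [1., 3., 5., 7., 9.]
# counts -> [3, 5, 1, 1]

# Running total of the counts: the bar heights of a cumulative histogram
cumulative_counts = np.cumsum(counts)
# cumulative_counts -> [3, 8, 9, 10]
```

Every bin is half-open except the last, which also includes its right edge. That is why the value 9 is counted in the final bin.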
<code>def histogram(data, n_bins, cumulative=False, x_label = "", y_label = "", title = ""):
    _, ax = plt.subplots()
    # ax.hist takes the number of bins through its `bins` argument
    ax.hist(data, bins = n_bins, cumulative = cumulative, color = '#539caf')
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
</code>
Overlaying two histograms, with the one drawn on top made partially transparent, allows direct comparison of their distributions.
<code># Overlay 2 histograms to compare them
def overlaid_histogram(data1, data2, n_bins = 0, data1_name="", data1_color="#539caf", data2_name="", data2_color="#7663b0", x_label="", y_label="", title=""):
    # Use one common range so both histograms share the same bin edges
    data_range = (min(min(data1), min(data2)), max(max(data1), max(data2)))
    if n_bins == 0:
        # Default to 10 equal-width bins across the combined range
        max_nbins = 10
        bins = np.linspace(data_range[0], data_range[1], max_nbins + 1)
    else:
        bins = n_bins
    # Create the plot
    _, ax = plt.subplots()
    # The second histogram is semi-transparent so the first stays visible underneath
    ax.hist(data1, bins = bins, range = data_range, color = data1_color, alpha = 1, label = data1_name)
    ax.hist(data2, bins = bins, range = data_range, color = data2_color, alpha = 0.75, label = data2_name)
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
    ax.legend(loc = 'best')
</code>
Bar Plot
Bar plots are effective for visualizing categorical data with a limited number of categories. Three variants are covered: regular, grouped, and stacked bar plots.
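The stacked and grouped variants differ from the regular bar plot only in where each bar is placed. As a rough sketch of that arithmetic, the helpers below compute the baseline of each layer in a stack (the running sum of the layers beneath it) and the horizontal offset of each bar in a group. The helper names stack_bottoms and group_offsets are hypothetical, and the numbers are made up.

```python
import numpy as np

def stack_bottoms(y_data_list):
    """Baseline for each layer of a stacked bar: the sum of all layers below it."""
    heights = np.asarray(y_data_list, dtype=float)
    tops = np.cumsum(heights, axis=0)
    # The first layer starts at zero; each later layer starts where the previous one ends
    return np.vstack([np.zeros(heights.shape[1]), tops[:-1]])

def group_offsets(n_series, total_width=0.8):
    """Bar width and centered x-offsets for n_series side-by-side bars."""
    ind_width = total_width / n_series
    # Offsets are symmetric around zero, so each group is centered on its tick
    offsets = (np.arange(n_series) - (n_series - 1) / 2) * ind_width
    return ind_width, offsets

# Three layers over two categories
bottoms = stack_bottoms([[1, 2], [3, 4], [5, 6]])
# bottoms -> [[0, 0], [1, 2], [4, 6]]

# Four bars per group sharing a total width of 0.8
width, offsets = group_offsets(4)
```

Each row of bottoms can be passed as the bottom argument of ax.bar for the matching layer, and each offset is added to the numeric x positions of a group.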
<code>def barplot(x_data, y_data, error_data, x_label="", y_label="", title=""):
    _, ax = plt.subplots()
    # Draw bars, position them in the center of the tick mark on the x-axis
    ax.bar(x_data, y_data, color = '#539caf', align = 'center')
    # Draw error bars to show standard deviation, set ls to 'none' to remove line between points
    ax.errorbar(x_data, y_data, yerr = error_data, color = '#297083', ls = 'none', lw = 2, capthick = 2)
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
</code>
<code>def stackedbarplot(x_data, y_data_list, colors, y_data_names="", x_label="", y_label="", title=""):
    _, ax = plt.subplots()
    # Running top of the stack; each series is drawn on top of all earlier ones
    bottom = np.zeros(len(x_data))
    for i in range(len(y_data_list)):
        ax.bar(x_data, y_data_list[i], color = colors[i], bottom = bottom, align = 'center', label = y_data_names[i])
        bottom = bottom + np.asarray(y_data_list[i])
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
    ax.legend(loc = 'upper right')
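</code>
<code>def groupedbarplot(x_data, y_data_list, colors, y_data_names="", x_label="", y_label="", title=""):
    _, ax = plt.subplots()
    # x_data must hold numeric positions (e.g. np.arange(n_categories)) so offsets can be added
    x_pos = np.asarray(x_data, dtype = float)
    # Total width for all bars at one x location
    total_width = 0.8
    # Width of each individual bar
    ind_width = total_width / len(y_data_list)
    # Offsets symmetric around zero, so each group of bars is centered on its tick
    alteration = (np.arange(len(y_data_list)) - (len(y_data_list) - 1) / 2) * ind_width
    for i in range(len(y_data_list)):
        ax.bar(x_pos + alteration[i], y_data_list[i], color = colors[i], label = y_data_names[i], width = ind_width)
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
    ax.legend(loc = 'upper right')
</code>
Box Plot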
Box plots provide a compact summary of a variable's distribution, showing median, quartiles, and potential outliers. The function below creates a customizable box plot.
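As a minimal sketch of what those elements represent, the code below computes the median, quartiles, whisker ends, and outliers with NumPy, using the 1.5 × IQR whisker rule that Matplotlib applies by default. The box_summary helper and the sample data are illustrative assumptions.

```python
import numpy as np

def box_summary(values, whis=1.5):
    """Median, quartiles, whisker ends, and outliers under the whis * IQR rule."""
    values = np.sort(np.asarray(values, dtype=float))
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = values[(values >= low_fence) & (values <= high_fence)]
    return {
        'median': median,
        'q1': q1,
        'q3': q3,
        # Whiskers end at the most extreme data points still inside the fences
        'whisker_low': inside.min(),
        'whisker_high': inside.max(),
        # Anything beyond the fences is drawn as an individual outlier point
        'outliers': values[(values < low_fence) | (values > high_fence)],
    }

summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])   # made-up sample
# median 5.5, q1 3.25, q3 7.75, whiskers at 1 and 9, outlier 30
```

In this sample the upper fence is 7.75 + 1.5 × 4.5 = 14.5, so the value 30 falls outside it and would be plotted as an outlier.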
<code>def boxplot(x_data, y_data, base_color="#539caf", median_color="#297083", x_label="", y_label="", title=""):
    _, ax = plt.subplots()
    # Draw one box per dataset in y_data; patch_artist = True lets the boxes be filled
    ax.boxplot(y_data,
               patch_artist = True,
               medianprops = {'color': median_color},
               boxprops = {'color': base_color, 'facecolor': base_color},
               whiskerprops = {'color': base_color},
               capprops = {'color': base_color})
    # By default the x-axis ticks are numbered; replace them with the names in x_data
    ax.set_xticklabels(x_data)
    ax.set_ylabel(y_label)
    ax.set_xlabel(x_label)
    ax.set_title(title)
</code>
Conclusion
The article presented five easy‑to‑use Matplotlib visualizations and demonstrated how abstracting each plot into a function makes the code more readable, reusable, and suitable for rapid data‑exploration workflows.