Master Plotly Sankey Diagrams: From Web Traffic to Energy Flow
This article covers the origins and applications of Sankey diagrams and shows how to build them with Plotly in Python for website navigation, energy conversion, cost breakdowns, financial flows, data migration, and confusion matrices, with complete code for each scenario.
Sankey diagrams originated in the 19th century as a way to chart energy flows and have since become a general-purpose tool for showing how a quantity moves between entities or stages.
A Sankey diagram connects two or more sets of nodes with links whose width is proportional to the amount flowing through them, which makes the size of each flow between source and target easy to compare at a glance.
Common applications of Sankey diagrams include:
Energy flow: showing how energy moves from source to end users, e.g., solar energy conversion.
Cost structure: illustrating how product costs are allocated across components.
Website traffic: tracking user navigation on a site.
Transaction flow: visualizing movement of money, goods, or services among entities.
Data migration: tracing data flow between databases or servers.
Confusion matrix: comparing predicted vs. actual values in classification.
Website Traffic
Tracking user navigation in an online store from the homepage to a product page, then to the cart, and finally to checkout.
Implementation using the Plotly library:
<code>import plotly.graph_objects as go
# Data definition
labels = ["Homepage", "Product Page", "Cart", "Checkout", "Exit"]
source = [0, 0, 1, 1, 1, 2, 2, 2, 3]
target = [1, 4, 2, 3, 4, 3, 4, 1, 4]
value = [100, 10, 50, 30, 20, 40, 5, 5, 35]
# Create Sankey diagram
fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=labels
    ),
    link=dict(
        source=source,
        target=target,
        value=value
    ))])
fig.show()
</code>
The code displays a Sankey diagram showing how users move from the homepage to checkout and where they drop off.
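The same source, target, and value lists can also quantify the drop-off that the diagram shows. The following is a minimal sketch, not part of the original example: the names `outflow`, `exit_flow`, and `exit_share` are illustrative, and it assumes that every link into the "Exit" node counts as a drop-off.

```python
from collections import defaultdict

# Same illustrative data as the navigation example.
labels = ["Homepage", "Product Page", "Cart", "Checkout", "Exit"]
source = [0, 0, 1, 1, 1, 2, 2, 2, 3]
target = [1, 4, 2, 3, 4, 3, 4, 1, 4]
value = [100, 10, 50, 30, 20, 40, 5, 5, 35]

EXIT = labels.index("Exit")

outflow = defaultdict(int)    # total traffic leaving each page
exit_flow = defaultdict(int)  # traffic leaving each page straight to "Exit"
for s, t, v in zip(source, target, value):
    outflow[s] += v
    if t == EXIT:
        exit_flow[s] += v

# Share of each page's outgoing traffic that goes to "Exit".
exit_share = {labels[s]: exit_flow[s] / outflow[s] for s in outflow}

for page, share in exit_share.items():
    print(f"{page}: {share:.0%} of outgoing visits exit")
```

With these numbers, 20 of the 100 visits leaving the product page go straight to "Exit", so that page has an exit share of 20%.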
Energy Flow
Illustrating how solar energy is converted into electrical and thermal energy.
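A conversion diagram like this is only believable if the intermediate node passes on exactly what it receives. As a sanity check before plotting, one could compare inflow and outflow per node. This is a sketch under assumptions: `node_imbalance` is a hypothetical helper, and the figures (100 units in, split 60/40 between electrical and thermal output) are illustrative.

```python
def node_imbalance(n_nodes, source, target, value):
    """Return inflow minus outflow for every node index."""
    balance = [0] * n_nodes
    for s, t, v in zip(source, target, value):
        balance[s] -= v  # flow leaving the source node
        balance[t] += v  # flow arriving at the target node
    return balance

labels = ["Solar Energy", "Conversion System", "Electrical Energy", "Thermal Energy"]
source = [0, 1, 1]
target = [1, 2, 3]
value = [100, 60, 40]

balance = node_imbalance(len(labels), source, target, value)
conversion = labels.index("Conversion System")
# An intermediate node should pass on exactly what it receives.
assert balance[conversion] == 0, "Conversion System gains or loses energy"
```

Pure sources end up with a negative balance and pure sinks with a positive one, so only intermediate nodes are expected to balance to zero.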
<code>import plotly.graph_objects as go
# Data
labels = ["Solar Energy", "Conversion System", "Electrical Energy", "Thermal Energy"]
# One link into the conversion system, two links out of it
source = [0, 1, 1]
target = [1, 2, 3]
value = [100, 60, 40]
# Create Sankey diagram
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=source, target=target, value=value)
))
fig.update_layout(title_text="Solar Energy Conversion")
fig.show()
</code>
Cost Structure
Considering the manufacturing cost of a smartphone.
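Link widths convey relative size, but a cost breakdown often reads better with the percentages spelled out. One possible approach, sketched here with a hypothetical `labels_with_share` list and the illustrative figures used in this section, is to append each category's share to its node label:

```python
labels = ["Total Cost", "Components", "Labor", "Marketing", "Others"]
value = [300, 50, 100, 50]  # one entry per cost category, in label order

total = sum(value)
# Keep the root label (with the total) and append each category's share.
labels_with_share = [f"{labels[0]} ({total})"] + [
    f"{name} ({v / total:.0%})" for name, v in zip(labels[1:], value)
]
print(labels_with_share)
```

Passing `labels_with_share` as the node `label` in place of `labels` would show the shares directly on the diagram.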
<code>import plotly.graph_objects as go
# Every link starts at "Total Cost" and ends at one cost category
labels = ["Total Cost", "Components", "Labor", "Marketing", "Others"]
source = [0, 0, 0, 0]
target = [1, 2, 3, 4]
value = [300, 50, 100, 50]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=source, target=target, value=value)
))
fig.update_layout(title_text="Phone Manufacturing Cost Structure")
fig.show()
</code>
Transaction Flow
Illustrating a simple money flow.
<code>import plotly.graph_objects as go
# Money leaves the bank for two stores; the online store then pays taxes
labels = ["Bank", "Online Store", "Physical Store", "Taxes"]
source = [0, 0, 1]
target = [1, 2, 3]
value = [150, 50, 40]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=source, target=target, value=value)
))
fig.update_layout(title_text="Money Flow")
fig.show()
</code>
Data Migration
Considering data flow from one database to another.
<code>import plotly.graph_objects as go
# 200 records arrive at Database B: 5 end in errors and 195 succeed
labels = ["Database A", "Database B", "Errors", "Success"]
source = [0, 1, 1]
target = [1, 2, 3]
value = [200, 5, 195]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=source, target=target, value=value)
))
fig.update_layout(title_text="Data Migration")
fig.show()
</code>
Confusion Matrix
Using a Sankey diagram to visualize a binary classification confusion matrix, showing flows between actual and predicted positives and negatives.
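In practice the four link values come from counting (actual, predicted) pairs. The following is a minimal sketch of that step, assuming hypothetical `y_true` and `y_pred` lists with 1 for positive and 0 for negative. The node ordering matches the labels used in this section.

```python
from collections import Counter

# Hypothetical ground-truth and predicted labels (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

counts = Counter(zip(y_true, y_pred))  # (actual, predicted) -> count

labels = ["Actual Positive", "Actual Negative",
          "Predicted Positive", "Predicted Negative"]
# Node index for each class on the actual side and on the predicted side.
actual_node = {1: 0, 0: 1}
predicted_node = {1: 2, 0: 3}

source, target, value = [], [], []
for actual in (1, 0):
    for predicted in (1, 0):
        source.append(actual_node[actual])
        target.append(predicted_node[predicted])
        value.append(counts[(actual, predicted)])
```

The loop emits the links in the order true positives, false negatives, false positives, true negatives, so each entry in `value` can be read off against the usual confusion matrix cells.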
<code>import plotly.graph_objects as go
labels = ["Actual Positive", "Actual Negative", "Predicted Positive", "Predicted Negative"]
# Links in order: true positives, false negatives, false positives, true negatives
source = [0, 0, 1, 1]
target = [2, 3, 2, 3]
value = [40, 10, 5, 45]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(source=source, target=target, value=value)
))
fig.update_layout(title_text="Confusion Matrix Visualization")
fig.show()
</code>
Creating a Sankey diagram with Plotly follows the same steps each time: define the node labels, specify the source, target, and value lists for the links, construct the diagram with go.Sankey, customize the layout with update_layout, and display it with show. Larger flow values produce wider links, providing an intuitive visual representation of flow magnitude.
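Writing the index lists by hand becomes error-prone as a diagram grows, because every link must refer to the right position in the label list. One way to avoid this, sketched below, is to describe the links as labeled tuples and derive the lists from them. `edges_to_sankey_lists` is a hypothetical helper, not part of Plotly, and the sample edges reuse the illustrative money-flow figures.

```python
def edges_to_sankey_lists(edges):
    """Convert (source_label, target_label, value) tuples to index lists."""
    labels, index = [], {}
    source, target, value = [], [], []

    def node_id(name):
        # Assign node indices in order of first appearance.
        if name not in index:
            index[name] = len(labels)
            labels.append(name)
        return index[name]

    for src, dst, amount in edges:
        source.append(node_id(src))
        target.append(node_id(dst))
        value.append(amount)
    return labels, source, target, value

edges = [
    ("Bank", "Online Store", 150),
    ("Bank", "Physical Store", 50),
    ("Online Store", "Taxes", 40),
]
labels, source, target, value = edges_to_sankey_lists(edges)
```

The four returned lists can then be passed to go.Sankey as `node=dict(label=labels)` and `link=dict(source=source, target=target, value=value)`, exactly as in the examples.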
Data scientists, market researchers, and anyone interested in data visualization should explore Sankey diagrams to better understand and communicate their data stories.
Give it a try!
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".