Fundamentals 13 min read

Using Python pandas to Replicate Excel Functions and Visualizations

This article demonstrates how to replace common Excel operations such as data import, VLOOKUP, pivot tables, and charting with Python libraries like pandas and plotly, providing step‑by‑step code examples, performance tips, and comparisons that help analysts transition from spreadsheet‑based workflows to programmatic data analysis.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Using Python pandas to Replicate Excel Functions and Visualizations

Historically, business analysis relied on Excel, but as data volumes grew beyond Excel's row limit (1,048,576 rows), analysts shifted to Python and R, coining the term "Business Analytics". Python's pandas library now offers a mature ecosystem for data manipulation, mirroring many Excel capabilities.

The four primary pandas I/O functions— read_csv , read_excel , to_csv , and to_excel —allow fine‑grained control over import and export, similar to Excel's options. The article loads pandas with:

<code>import numpy as np
import pandas as pd
pd.set_option('max_columns', 10)
pd.set_option('max_rows', 20)
pd.set_option('display.float_format', lambda x: '%.2f' % x)  # disable scientific notation</code>

Key Excel‑like functionalities demonstrated include:

Reading a specific range of an Excel sheet using pd.read_excel('test.xlsx', header=1, nrows=17, usecols=3) .

Implementing VLOOKUP logic in Python by defining a grading function and applying it to a DataFrame.

Performing left joins (equivalent to VLOOKUP across tables) with df2.merge(df1[['编号','月折旧额']], how='left', on='编号') .

Handling approximate lookups by iterating over rows and matching substrings.

Auto‑populating related fields (e.g., employee details) using df5.merge(df4[['工号','姓名','部门','职务','入职日期']], on='工号') .

For aggregation tasks, the article compares Excel pivot tables with pandas groupby and pivot_table approaches, showing how to compute monthly sales, costs, and profit per region efficiently:

<code>df['订购月份'] = df['订购日期'].apply(lambda x: x.month)
agg = df.groupby(['订购月份', '所属区域'])[['销售额','成本']].sum()
agg['利润'] = agg['销售额'] - agg['成本']
# or using pivot_table
pt = pd.pivot_table(df, values=['销售额','成本'], index=['订购月份','所属区域'], aggfunc='sum')
pt['利润'] = pt['销售额'] - pt['成本']
</code>

Visualization is covered using Plotly for interactive charts. Examples include stacked bar charts, radar charts, and custom styling, with code such as:

<code>import plotly.offline as off
import plotly.graph_objs as go
off.init_notebook_mode()
trace1 = go.Bar(x=df['国家'], y=df[1995], name='1995', opacity=0.6, marker=dict(color='powderblue'))
# ... additional traces ...
layout = go.Layout(barmode='group')
fig = go.Figure(data=[trace1, trace2, trace3], layout=layout)
off.plot(fig)
</code>

The article concludes that while Excel remains convenient for simple charts, Python offers greater flexibility and automation for complex analyses, making it a preferred tool for data professionals.

Pythondata analysisvisualizationExcelpandasPlotlyVLOOKUP
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.