
Comparing Multiple Sheets in Two Excel Files Using Python Pandas

This guide shows how to install pandas and openpyxl, read every sheet from two Excel workbooks, compare the sheets that share a name, inspect the differences, and save them to a new Excel file. A complete script is included at the end.

Test Development Learning Exchange

Comparing datasets from different sources is a common task in data processing and analysis, especially when the data is spread across several Excel workbooks and sheets. Python's pandas library can load every sheet into a DataFrame and compare the sheets with a few lines of code.

Step 1: Install required libraries. Make sure pandas and openpyxl are installed in your Python environment, e.g. with pip install pandas openpyxl. pandas uses openpyxl to read and write .xlsx files.
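
To confirm that both packages are visible to the interpreter you will run the script with, a quick check along these lines can help. It uses only the standard library; the helper name installed_versions is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pandas", "openpyxl")):
    """Return {package: version string}, with None for packages that are missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

If either line reports "not installed", rerun the pip command in the same environment.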

Step 2: Read multiple sheets from Excel files. Use pandas.ExcelFile, or pandas.read_excel with sheet_name=None, to load all sheets from each workbook:

import pandas as pd
# Read all sheets from the first Excel file
xlsx1 = pd.ExcelFile('file1.xlsx')
sheets1 = {sheet_name: xlsx1.parse(sheet_name) for sheet_name in xlsx1.sheet_names}
# Read all sheets from the second Excel file
xlsx2 = pd.ExcelFile('file2.xlsx')
sheets2 = {sheet_name: xlsx2.parse(sheet_name) for sheet_name in xlsx2.sheet_names}
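
As an alternative to the ExcelFile approach, pandas.read_excel returns a dictionary of DataFrames keyed by sheet name when it is called with sheet_name=None. A minimal sketch, where the helper name load_all_sheets is hypothetical and the file names are the same placeholders used above:

```python
import pandas as pd


def load_all_sheets(path):
    """Return {sheet name: DataFrame} for every sheet in the workbook at `path`."""
    # sheet_name=None asks pandas for all sheets instead of only the first one.
    return pd.read_excel(path, sheet_name=None)


# Usage with the placeholder file names from this guide:
# sheets1 = load_all_sheets('file1.xlsx')
# sheets2 = load_all_sheets('file2.xlsx')
```

Both approaches produce the same kind of dictionary, so the later steps work unchanged with either one.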

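Step 3: Compare data. Iterate over the sheet names that exist in both workbooks, merge each pair of DataFrames with an outer join, and keep the rows that do not appear in both. Because no on argument is given, merge joins on every column the two sheets share, so a row counts as a match only when all shared columns are equal: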

# Create a dictionary to store comparison results
comparison_results = {}
for sheet_name in sheets1.keys():
    if sheet_name in sheets2:
        df1 = sheets1[sheet_name]
        df2 = sheets2[sheet_name]
        comparison = df1.merge(df2, how='outer', indicator=True)
        comparison_results[sheet_name] = comparison[comparison['_merge'] != 'both']
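
To see what this merge produces, here is a minimal sketch on two small in-memory DataFrames. The sample data and the helper name diff_rows are invented for illustration; the merge call is the same one used in the loop above.

```python
import pandas as pd


def diff_rows(df1, df2):
    """Rows that appear in only one of the two frames, tagged by origin."""
    # With no `on` argument, merge joins on every column name the frames share.
    merged = df1.merge(df2, how="outer", indicator=True)
    return merged[merged["_merge"] != "both"]


# Hypothetical sample data: id 2 changed its amount, id 3 was removed, id 4 was added.
old = pd.DataFrame({"id": [1, 2, 3], "amount": [10, 20, 30]})
new = pd.DataFrame({"id": [1, 2, 4], "amount": [10, 25, 40]})

print(diff_rows(old, new))
```

The changed row (id 2) shows up twice, once as left_only with the old amount and once as right_only with the new one. One caveat of this approach is that duplicate rows can be missed: a row that appears twice in one sheet and once in the other still matches and is tagged both.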

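Step 4: Analyze differences. Each resulting DataFrame holds the rows whose _merge value is left_only (present only in file1.xlsx) or right_only (present only in file2.xlsx). A row that was modified appears twice, once per side. Because the merge joins on all shared columns, the output has no _x and _y suffixed columns; those appear only when you pass on= with a subset of the columns. To print the differences for each sheet: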

for sheet_name, result in comparison_results.items():
    if not result.empty:
        print(f"Differences found in '{sheet_name}':")
        print(result)
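
If a sheet has a column that uniquely identifies each row, merging on that column alone puts the old and new values side by side in _x and _y columns. The following is a sketch under that assumption, not part of the original script: the helper name changed_values, the key column id, and the sample data are hypothetical, and it assumes ordinary NumPy-backed columns.

```python
import pandas as pd


def changed_values(df1, df2, key):
    """Rows whose `key` exists in both frames but whose other shared columns differ."""
    merged = df1.merge(df2, on=key, how="inner", suffixes=("_x", "_y"))
    # Only columns present in both frames receive the _x/_y suffixes.
    value_cols = [c for c in df1.columns if c != key and c in df2.columns]
    differs = pd.Series(False, index=merged.index)
    for col in value_cols:
        left, right = merged[f"{col}_x"], merged[f"{col}_y"]
        # Treat two missing values as equal; a plain comparison would flag NaN vs NaN.
        same = (left == right) | (left.isna() & right.isna())
        differs |= ~same
    return merged[differs]


# Hypothetical sample data keyed by "id".
old = pd.DataFrame({"id": [1, 2, 3], "amount": [10, 20, 30]})
new = pd.DataFrame({"id": [1, 2, 4], "amount": [10, 25, 40]})

print(changed_values(old, new, "id"))  # one row: id=2, amount_x=20, amount_y=25
```

Rows that exist on only one side (ids 3 and 4 here) are not reported by this helper; the indicator-based merge from Step 3 already covers them.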

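Step 5: Save comparison results. Write the non-empty comparison DataFrames to a new Excel file for reporting:

# Keep only sheets that differ; a workbook with no sheets cannot be saved
non_empty = {name: result for name, result in comparison_results.items() if not result.empty}
if non_empty:
    with pd.ExcelWriter('comparison_results.xlsx') as writer:
        for sheet_name, result in non_empty.items():
            result.to_excel(writer, sheet_name=sheet_name, index=False)
else:
    print('No differences found; nothing to write.')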
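
For reporting, it can also help to put a per-sheet count of differences at the front of the output workbook. A possible sketch follows; the helper name write_report and the sheet name Summary are assumptions, so pick another name if one of your workbooks already has a sheet called Summary.

```python
import pandas as pd


def write_report(comparison_results, path):
    """Write a 'Summary' sheet plus one sheet per non-empty diff to `path`."""
    summary = pd.DataFrame(
        [
            {
                "sheet": name,
                "only_in_file1": int((result["_merge"] == "left_only").sum()),
                "only_in_file2": int((result["_merge"] == "right_only").sum()),
            }
            for name, result in comparison_results.items()
        ],
        columns=["sheet", "only_in_file1", "only_in_file2"],
    )
    with pd.ExcelWriter(path) as writer:
        # Writing the summary first means the workbook always has at least one sheet.
        summary.to_excel(writer, sheet_name="Summary", index=False)
        for name, result in comparison_results.items():
            if not result.empty:
                result.to_excel(writer, sheet_name=name, index=False)
    return summary


# Usage with the dictionary built in Step 3:
# write_report(comparison_results, 'comparison_results.xlsx')
```

Sheets without differences still get a row of zeros in the summary, so a reader can tell they were checked.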

Full code example. The following script combines all steps into a single, runnable program:

import pandas as pd
# Read Excel files
xlsx1 = pd.ExcelFile('file1.xlsx')
xlsx2 = pd.ExcelFile('file2.xlsx')
# Read all sheets
sheets1 = {sheet_name: xlsx1.parse(sheet_name) for sheet_name in xlsx1.sheet_names}
sheets2 = {sheet_name: xlsx2.parse(sheet_name) for sheet_name in xlsx2.sheet_names}
# Dictionary for results
comparison_results = {}
# Compare data
for sheet_name in sheets1.keys():
    if sheet_name in sheets2:
        df1 = sheets1[sheet_name]
        df2 = sheets2[sheet_name]
        comparison = df1.merge(df2, how='outer', indicator=True)
        comparison_results[sheet_name] = comparison[comparison['_merge'] != 'both']
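# Save results (skip the file when nothing differs; a workbook with no sheets cannot be saved)
non_empty = {name: result for name, result in comparison_results.items() if not result.empty}
if non_empty:
    with pd.ExcelWriter('comparison_results.xlsx') as writer:
        for sheet_name, result in non_empty.items():
            result.to_excel(writer, sheet_name=sheet_name, index=False)
else:
    print('No differences found; nothing to write.')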
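
The script only compares sheets whose names appear in both workbooks, so a sheet that exists in just one file is skipped without any message. A small helper along these lines can surface those sheets; the name sheet_coverage is made up for illustration, and it works on the sheets1 and sheets2 dictionaries built above.

```python
def sheet_coverage(sheets1, sheets2):
    """Split sheet names into shared, only-in-first, and only-in-second lists."""
    names1, names2 = set(sheets1), set(sheets2)
    return {
        "shared": sorted(names1 & names2),
        "only_in_file1": sorted(names1 - names2),
        "only_in_file2": sorted(names2 - names1),
    }


# Hypothetical example with placeholder values standing in for DataFrames.
coverage = sheet_coverage({"Sales": None, "Costs": None}, {"Costs": None, "Staff": None})
print(coverage)
# {'shared': ['Costs'], 'only_in_file1': ['Sales'], 'only_in_file2': ['Staff']}
```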

By following these steps you can compare multiple sheets across two Excel files, identify the rows that differ, and export the findings. This is useful for financial audits, data cleaning, or any scenario that requires consistency checks between datasets.

Tags: data analysis, Excel, Pandas, data comparison