Using openpyxl to Operate Excel Files in Python
This article provides a comprehensive guide to using the openpyxl library for creating, opening, editing, and saving Excel workbooks in Python, covering worksheet management, cell operations, merging, iteration, and efficient handling of large files with code examples.
Python offers several libraries for Excel manipulation, and openpyxl is a popular choice due to its balance of functionality and performance. The following sections demonstrate common tasks such as creating a new workbook, opening existing files, and configuring read‑only or write‑only modes for large datasets.
Creating a workbook
<code>from openpyxl import Workbook
wb = Workbook()
</code>Opening an existing workbook
<code>from openpyxl import load_workbook
wb2 = load_workbook('test.xlsx')
</code>For large files, you can reduce memory usage by using read_only=True or write_only=True when loading or creating a workbook.
<code>wb = load_workbook(filename='large_file.xlsx', read_only=True)
wb = Workbook(write_only=True)
</code>Worksheet handling
Access the active sheet with ws = wb.active or create new sheets at specific positions:
<code>ws1 = wb.create_sheet('Mysheet') # append at end (default)
ws2 = wb.create_sheet('Mysheet', 0) # insert at first position
ws3 = wb.create_sheet('Mysheet', -1) # insert before last sheet
</code>Retrieve a sheet by name ( ws = wb['SheetName'] ) and list all sheet names with wb.sheetnames .
<code>print(wb.sheetnames)
# ['Sheet2', 'New Title', 'Sheet1']
</code>Saving workbooks
You can save to a temporary stream for network transmission or directly to a file:
<code>from tempfile import NamedTemporaryFile
wb = Workbook()
with NamedTemporaryFile() as tmp:
wb.save(tmp.name)
tmp.seek(0)
stream = tmp.read()
wb.save('balances.xlsx')
</code>To save as a template, set wb.template = True before saving with an .xltx extension.
<code>wb = load_workbook('document.xlsx')
wb.template = True
wb.save('document_template.xltx')
</code>Cell operations
Read and write cell values using the A1 notation:
<code>c = ws['A4']
ws['A4'] = 4
c.value = 'hello, world'
</code>Access ranges with slicing, set number formats, and add formulas:
<code>cell_range = ws['A1':'C2']
ws['A1'] = datetime.datetime(2010, 7, 21)
print(ws['A1'].number_format) # 'yyyy-mm-dd h:mm:ss'
ws['A1'] = '=SUM(1, 1)'
</code>Merge and unmerge cells either by range string or by specifying start/end rows and columns:
<code>ws.merge_cells('A2:D2')
ws.unmerge_cells('A2:D2')
ws.merge_cells(start_row=2, start_column=1, end_row=4, end_column=4)
ws.unmerge_cells(start_row=2, start_column=1, end_row=4, end_column=4)
</code>Row and column manipulation
Retrieve entire rows or columns, iterate over them, or insert/delete rows and columns:
<code>colC = ws['C']
col_range = ws['C:D']
row10 = ws[10]
row_range = ws[5:10]
for row in ws.iter_rows(min_row=1, max_col=3, max_row=2):
for cell in row:
print(cell)
for col in ws.iter_cols(min_row=1, max_col=3, max_row=2):
for cell in col:
print(cell)
ws.insert_rows(7)
ws.insert_cols(3)
ws.delete_rows(5, 2)
ws.delete_cols(2, 1)
</code>Reading values only
Use ws.values or set values_only=True in iter_rows / iter_cols to retrieve just the cell contents:
<code>for row in ws.values:
for value in row:
print(value)
for row in ws.iter_rows(min_row=1, max_col=3, max_row=2, values_only=True):
print(row)
</code>Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.