Fundamentals 6 min read

Using python-docx: Document Structure and Basic Operations

This article introduces the python‑docx library, explains its document model—including Document, Paragraph, Run, and Table objects—and provides practical Python code examples for creating, modifying, and styling Word documents, inserting headings, page breaks, tables, and images.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Using python-docx: Document Structure and Basic Operations

python-docx is a Python library for creating and modifying Microsoft Word (.docx) files. It treats an entire document as a Document object with a hierarchical structure.

The Document contains multiple Paragraph objects stored in document.paragraphs . Each Paragraph consists of several Run objects stored in paragraph.runs , which represent inline elements sharing the same text style.

Runs are the smallest units; text style changes cause python‑docx to split text into separate runs. Tables are accessed via document.tables ; each Table has rows, columns, and cells, and each cell contains its own paragraphs.

Below are common operations using python‑docx:

<code>from docx import Document</code>
<code>from docx.enum.text import WD_BREAK</code>
<code>from docx.shared import Cm</code>
<code>from docx.oxml.ns import qn</code>
<code>document = Document('测试文档.docx')</code>
<code>for p in document.paragraphs:</code>
<code>    print(p.text)</code>
<code>document.add_paragraph('这是新的段落内容')</code>
<code># 添加标题</code>
<code># 默认情况下添加的标题是最高一级的,即一级标题,通过参数 level 设定,范围是 1 ~ 9</code>
<code>document.add_heading('我是一级标题')</code>
<code>document.add_heading('我是二级标题', level=2)</code>
<code># 添加换页</code>
<code># 如果一个段落不满一页,需要分页时,可以插入一个分页符,直接调用会将分页符插入到最后一个段落之后:</code>
<code>document.add_page_break()</code>
<code>paragraph = document.add_paragraph("独占一页")  # 添加一个段落</code>
<code>paragraph.runs[-1].add_break(WD_BREAK.PAGE)  # 在段落的最后一个节段后添加分页</code>
<code># 表格操作</code>
<code>table = document.add_table(rows=2, cols=2)</code>
<code>cell = table.cell(0, 1)  # 获取第一行第二列单元格</code>
<code>cell.text = '我是单元格文字'  # 设置单元格文本</code>
<code>row = table.rows[1]  # 表格的行</code>
<code>row.cells[0].text = 'Foo bar to you.'</code>
<code>row.cells[1].text = 'And a hearty foo bar to you too sir!'</code>
<code>row = table.add_row()  # 增加行</code>
<code># 添加表格(复杂)</code>
<code>items = ((7, '1024', '手机'), (3, '2042', '笔记本'), (1, '1288', '台式机'),)</code>
<code>table = document.add_table(1, 3)</code>
<code>heading_cells = table.rows[0].cells</code>
<code>heading_cells[0].text = '数量'</code>
<code>heading_cells[1].text = '编码'</code>
<code>heading_cells[2].text = '描述'</code>
<code>for item in items:</code>
<code>    cells = table.add_row().cells</code>
<code>    cells[0].text = str(item[0])</code>
<code>    cells[1].text = item[1]</code>
<code>    cells[2].text = item[2]</code>
<code># 添加图片</code>
<code>document.add_picture('docx.png', width=Cm(10))</code>
<code># 样式</code>
<code># 样式可以针对整体文档(document)、段落(paragraph)、节段(run),越具体,样式优先级越高</code>
<code># 段落样式</code>
<code># 段落样式包括:对齐、列表样式、行间距、缩进、背景色等,可以在添加段落时设定,也可以在添加之后设置:</code>
<code>document.add_paragraph('Header', style='Header')</code>
<code># 文字样式</code>
<code># 设置加粗/斜体</code>
<code>paragraph = document.add_paragraph('添加一个段落')</code>
<code>run = paragraph.add_run('添加一个节段')</code>
<code>run.bold = True</code>
<code>run = paragraph.add_run('我是斜体的')</code>
<code>run.italic = True</code>
<code># 设置字体</code>
<code>paragraph = document.add_paragraph('我的字体是 宋体')</code>
<code>run = paragraph.runs[0]</code>
<code>run.font.name = '宋体'</code>
<code>run._element.rPr.rFonts.set(qn('w:eastAsia'), '微软雅黑')  # 设置完这个才生效</code>
PythonCode Exampledocument processingpython-docxword-automation
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.