
Introducing DrissionPage: A Python Web Automation Tool Combining Browser Control and Requests

DrissionPage is a Python web-automation library that merges browser-driven interaction with high-efficiency request handling. It offers a concise syntax, fast execution, and features such as iframe handling, shadow-root access, and full-page screenshots, making web scraping and testing more approachable.


DrissionPage is a Python web automation tool that can control browsers and send/receive network packets, combining the convenience of browser automation with the efficiency of the requests library.

Background

Collecting data with requests from sites that require login often means analyzing network packets and dealing with JavaScript rendering, captchas, obfuscation, and signed request parameters, all of which slow development. Browser automation bypasses many of these obstacles but runs more slowly.

The library was designed to merge these approaches, enabling fast development and execution, allowing mode switching, and providing a user‑friendly API that abstracts details so developers can focus on functionality.

Earlier versions wrapped Selenium; since version 3.0 the core has been rewritten from scratch, removing the Selenium dependency and improving both features and performance.
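The mode switching described above can be sketched as follows. This is a minimal illustration based on the `WebPage` class and its `change_mode()` method from the library's documented API; the import is guarded only so the sketch degrades gracefully where DrissionPage is not installed, and the login step is left as a placeholder comment.

```python
# Minimal sketch of DrissionPage's d/s mode switching (assumes the
# documented WebPage API; guarded import so the sketch runs without it).
try:
    from DrissionPage import WebPage
except ImportError:
    WebPage = None

def fetch_after_login(url):
    """Use browser ('d') mode to get past login/JS, then switch to the
    faster requests-backed ('s') mode; cookies carry over on switch."""
    if WebPage is None:
        return None          # library not installed: nothing to demonstrate
    page = WebPage()         # starts in 'd' (driver/browser) mode
    page.get(url)
    # ... click through the login flow here, in browser mode ...
    page.change_mode()       # switch to 's' (session/requests) mode
    page.get(url)            # same URL, now fetched without the browser
    return page.html

result = fetch_after_login('https://example.com')
```

The point of the pattern is that slow, browser-backed steps (login, JavaScript challenges) and fast, requests-backed bulk fetching share one object and one cookie state, instead of being two separate tools glued together by hand.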

Core Capabilities

Leaves no webdriver fingerprints, making the automation harder for sites to detect.

No need to download drivers for different browser versions.

Faster runtime speed.

Can locate elements across iframes without switching contexts.

Treats iframes as normal elements, simplifying logic.

Allows simultaneous operation on multiple browser tabs, even when inactive.

Can directly read browser cache to save images without GUI interaction.

Supports full‑page screenshots, including areas outside the viewport (supported in browsers version 90+).

Handles non‑open shadow‑root elements.

Getting Started Demo

The SessionPage object operates in "s" (session) mode: it fetches pages with the requests library while exposing the same page-and-element interface as the browser-driven objects.

Example code demonstrates creating a SessionPage, navigating to a URL, locating h3 elements, extracting links, and printing their text and href attributes.

<code># Import the page class
from DrissionPage import SessionPage
# Create the page object
page = SessionPage()
# Visit the web page
page.get('https://gitee.com/explore/all')
# Find elements on the page
items = page.eles('t:h3')
# Iterate over the elements
for item in items[:-1]:
    # Get the <a> element under the current <h3>
    lnk = item('tag:a')
    # Print the <a> element's text and href attribute
    print(lnk.text, lnk.link)
</code>

The script prints the title and link of each explored repository, showing how much less code DrissionPage needs than a typical requests + BeautifulSoup pipeline.
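To make that comparison concrete, here is the same h3 > a extraction written by hand against the standard library's html.parser; the HTML string is a made-up stand-in for a fetched page (actually fetching it would add further requests/urllib code on top).

```python
# The same h3 > a extraction done manually with the stdlib html.parser,
# to show the boilerplate that DrissionPage's element API replaces.
from html.parser import HTMLParser

class H3LinkParser(HTMLParser):
    """Collects (text, href) pairs for <a> tags nested inside <h3> tags."""
    def __init__(self):
        super().__init__()
        self.in_h3 = False
        self.href = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
        elif tag == "a" and self.in_h3:
            self.href = dict(attrs).get("href")

    def handle_data(self, data):
        if self.href is not None and data.strip():
            self.links.append((data.strip(), self.href))
            self.href = None

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False

html = '<h3><a href="/p/1">Project One</a></h3><h3><a href="/p/2">Project Two</a></h3>'
parser = H3LinkParser()
parser.feed(html)
print(parser.links)  # [('Project One', '/p/1'), ('Project Two', '/p/2')]
```

Tracking the open-tag state and attribute handoff by hand is exactly the kind of plumbing that `page.eles('t:h3')` followed by `item('tag:a')` collapses into two calls.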

For more details, visit the author's documentation site: https://g1879.gitee.io/drissionpagedocs/.


Tags: python, requests, web automation, selenium-alternative, drissionpage, scraping
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
