Tag

HTML parsing


Spring Full-Stack Practical Cases
Apr 25, 2025 · Backend Development

Master jsoup: Real‑World Spring Boot 3 Examples for HTML Parsing

This tutorial walks through practical jsoup usage within Spring Boot 3, covering dependency setup, parsing HTML from strings, fragments, URLs or files, extracting titles, links, images, applying CSS selectors, modifying elements, and sanitizing content to prevent XSS attacks.

HTML parsing · Java · Jsoup
0 likes · 10 min read
Code Mala Tang
Apr 19, 2025 · Fundamentals

Master HTML Parsing in Python: BeautifulSoup, lxml, and html.parser Compared

Learn why HTML parsing is essential for web scraping, explore three popular Python libraries—BeautifulSoup, lxml, and the built‑in html.parser—covering installation, core usage, advanced techniques, and a comparative analysis to help you choose the right tool for your project.

BeautifulSoup · HTML parsing · Python
0 likes · 11 min read
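Of the three libraries named in the summary above, `html.parser` is the only one that ships with Python. The following is a minimal sketch of link extraction with it, not code from the article; the `LinkCollector` class, the `extract_links` helper, and the sample markup are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags using only the standard library."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute
        # names are lowercased by HTMLParser before they reach us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Hypothetical helper: feed a document and return the collected hrefs."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Illustrative markup, not taken from the article.
sample_html = '<p><a href="/docs">Docs</a> and <A HREF="https://example.com">site</A></p>'
links = extract_links(sample_html)
print(links)
```

BeautifulSoup and lxml expose a tree you can query after parsing, whereas `html.parser` is event-driven: you react to tags as they stream past, which keeps dependencies at zero but leaves tree building to you.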
Python Programming Learning Circle
Dec 28, 2024 · Backend Development

Getting Started with requests-html: Installation, Basic Usage, Advanced Features, and Web Scraping Examples

This article introduces the Python requests-html library. It covers installation, basic operations such as fetching pages and extracting links and elements, and advanced capabilities such as JavaScript rendering, pagination, and custom requests, then closes with practical web‑scraping examples for sites like Jianshu and Tianya.

Automation · HTML parsing · Web Scraping
0 likes · 16 min read
Python Programming Learning Circle
Aug 10, 2022 · Backend Development

Python Web Scraping Tutorial: Using requests and BeautifulSoup to Extract Weather Data

This article demonstrates how to use Python's requests library and BeautifulSoup to inspect webpage source, set request headers, fetch weather page HTML, parse it with CSS selectors, extract daytime and nighttime temperatures, and extend the script to handle multiple cities, providing complete code examples.

BeautifulSoup · HTML parsing · Requests
0 likes · 7 min read
Sohu Tech Products
May 18, 2022 · Fundamentals

Overview of a Web Page Content Extraction Algorithm and Its Practical Demo

This article introduces a web page content extraction algorithm that automatically structures titles, timestamps, body text, authors, and sources from arbitrary news pages, explains how to use an online demo, compares it with existing solutions, and discusses its broader applications and limitations.

Algorithm · GNE · HTML parsing
0 likes · 8 min read
Baidu Geek Talk
Mar 21, 2022 · Frontend Development

How WebKit Parses HTML: Decoding, Tokenization, and DOM Tree Construction

The article details WebKit’s rendering pipeline in WKWebView, describing how the network process streams HTML bytes to the rendering process, which decodes them via TextResourceDecoder, tokenizes the characters with HTMLTokenizer’s state machine, and constructs an efficient DOM tree using HTMLTreeBuilder and queued insertion tasks.

Browser Engine · DOM · HTML parsing
0 likes · 33 min read
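The tokenization stage described above is a character-driven state machine. The sketch below is a heavily simplified illustration of that idea in Python, not WebKit's `HTMLTokenizer`: the `tokenize` function and its three states are assumptions made for this example, and attributes, comments, and character references are deliberately ignored.

```python
def tokenize(html):
    """Toy tokenizer returning ("start", name), ("end", name), ("text", data).

    Only bare tags such as <p> and </p> are understood; this is an
    illustration of the state-machine approach, not a conforming tokenizer.
    """
    tokens = []
    state = "data"          # current state: data, tag_open, or tag_name
    buffer = []             # characters accumulated for the current token
    is_end_tag = False

    for ch in html:
        if state == "data":
            if ch == "<":
                # Flush pending text before switching to tag handling.
                if buffer:
                    tokens.append(("text", "".join(buffer)))
                    buffer = []
                is_end_tag = False
                state = "tag_open"
            else:
                buffer.append(ch)
        elif state == "tag_open":
            if ch == "/":
                is_end_tag = True
            else:
                buffer.append(ch.lower())
            state = "tag_name"
        elif state == "tag_name":
            if ch == ">":
                kind = "end" if is_end_tag else "start"
                tokens.append((kind, "".join(buffer)))
                buffer = []
                state = "data"
            else:
                buffer.append(ch.lower())

    # Emit any trailing text once the input is exhausted.
    if state == "data" and buffer:
        tokens.append(("text", "".join(buffer)))
    return tokens


print(tokenize("<p>Hi <b>there</b></p>"))
```

A tree builder would consume these tokens in order, pushing an element on each start token and popping on each end token, which is the role the summary assigns to `HTMLTreeBuilder`.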
Python Programming Learning Circle
Mar 8, 2022 · Backend Development

XPath Basics and Web Scraping with Python lxml: Concepts, Syntax, and Practical Examples

This tutorial explains the fundamental concepts and parsing principles of XPath, shows how to set up the Python lxml environment, demonstrates instantiating etree objects, details XPath expression syntax, and provides multiple real‑world web‑scraping examples with complete code snippets.

HTML parsing · Web Scraping · lxml
0 likes · 9 min read
Baidu App Technology
Mar 7, 2022 · Mobile Development

How WKWebView Parses HTML: Decoding, Tokenization, and DOM Tree Construction

WKWebView parses HTML by streaming bytes from the network process to the rendering process, decoding them into characters, tokenizing the characters into HTML tokens, and building a DOM tree through node creation and insertion, with the tree held in memory as a doubly‑linked structure; the document is then laid out and painted.

DOM · HTML parsing · WKWebView
0 likes · 37 min read
Python Programming Learning Circle
Apr 12, 2021 · Backend Development

Common Regular Expressions and Methods for Python Web Scraping

This article presents a practical collection of Python regular‑expression techniques for extracting HTML elements such as table rows, links, titles, images, and scripts, showing how to filter tags and handle URL parameters during web crawling.

Data Extraction · HTML parsing · Python
0 likes · 20 min read
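As a rough illustration of the kind of extraction summarized above, the sketch below pulls a title, links, and image sources out of a page with the standard `re` module and splits URL parameters with `urllib.parse`. The patterns, the `extract` helper, and the sample page are assumptions made for this example rather than the article's own expressions, and regular expressions like these only hold up on reasonably regular markup.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical patterns; each captures the value of interest in group 1.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
LINK_RE = re.compile(r'<a\s[^>]*?href=["\'](.*?)["\']', re.IGNORECASE)
IMG_RE = re.compile(r'<img\s[^>]*?src=["\'](.*?)["\']', re.IGNORECASE)


def extract(html):
    """Return the page title, all link targets, and all image sources."""
    title_match = TITLE_RE.search(html)
    return {
        "title": title_match.group(1).strip() if title_match else None,
        "links": LINK_RE.findall(html),
        "images": IMG_RE.findall(html),
    }


# Illustrative page, not taken from the article.
page = """<html><head><title> Demo page </title></head>
<body><a class="nav" href="/home">Home</a>
<img alt="logo" src="/static/logo.png">
<a href='https://example.com/?id=1&page=2'>Next</a></body></html>"""

result = extract(page)
print(result)

# Split the query string of the second link into its parameters.
params = parse_qs(urlsplit(result["links"][1]).query)
print(params)
```

For nested or malformed markup a real parser is usually the safer choice; regular expressions fit best when the target fragment has a stable, flat shape.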
Python Programming Learning Circle
Mar 31, 2021 · Backend Development

Getting Started with requests-html: Installation, Basic Usage, and Advanced Features

This article introduces the Python requests-html library, covering its installation, basic operations such as fetching pages, extracting links and elements with CSS and XPath selectors, advanced capabilities like JavaScript rendering, pagination handling, custom request options, and practical web‑scraping examples.

HTML parsing · JavaScript rendering · Python
0 likes · 16 min read
Python Programming Learning Circle
Jan 14, 2021 · Big Data

Python Web Scraping Tutorial with Selenium and BeautifulSoup

This tutorial demonstrates how to create a Python web scraper using Selenium and BeautifulSoup, covering login automation, HTML retrieval, parsing with html5lib, data extraction from tables, and strategies for handling anti‑scraping measures such as headless browsing and proxy usage.

BeautifulSoup · Data Extraction · HTML parsing
0 likes · 7 min read
Xianyu Technology
Apr 16, 2020 · Mobile Development

Design and Implementation of RichText Mixed Content in Flutter for Xianyu Messaging

The article details Xianyu’s migration of its messaging rich‑text system to Flutter, explaining how RichText became a MultiChildRenderObjectWidget, how custom emoji placeholders are converted to HTML tags and parsed into TextSpan and WidgetSpan elements, enabling colored text, clickable links, and emoji rendering across Flutter versions.

Flutter · HTML parsing · RichText
0 likes · 9 min read
Python Programming Learning Circle
Apr 10, 2020 · Fundamentals

Introduction to BeautifulSoup (bs4) for HTML/XML Parsing in Python

This article introduces BeautifulSoup, a Python library for parsing HTML/XML, explains how to import it, choose among parsers, demonstrates tag navigation, searching with find/find_all, CSS selection, and tree traversal methods, and provides extensive code examples.

BeautifulSoup · HTML parsing · Python
0 likes · 13 min read
Python Programming Learning Circle
Dec 12, 2019 · Backend Development

Master Web Scraping with Python: Requests + BeautifulSoup Step‑by‑Step

This tutorial walks you through using Python's requests library to fetch a web page and BeautifulSoup4 to parse HTML, covering object creation, common attributes, tag properties, and the find() / find_all() methods for extracting specific content.

BeautifulSoup · HTML parsing · Python
0 likes · 6 min read