Tag

content-extraction

1 view collected around this technical thread.

Laravel Tech Community
Apr 2, 2023 · Backend Development

QueryList: A Modern PHP Content Scraping Library – Features, Installation, and Usage Guide

This article introduces QueryList, a modern PHP content‑scraping tool that uses CSS selectors instead of regular expressions. It explains the two versions (V3 and V4), shows how to install the library via Composer, and demonstrates basic crawling code, collection methods such as flatten, take, reverse, filter, and map, and concurrent multi‑request crawling.

Composer · Data Processing · content-extraction
0 likes · 7 min read
Sohu Tech Products
May 18, 2022 · Fundamentals

Overview of a Web Page Content Extraction Algorithm and Its Practical Demo

This article introduces a web page content extraction algorithm that automatically extracts and structures the title, timestamp, body text, author, and source from arbitrary news pages. It explains how to use an online demo, compares the algorithm with existing solutions, and discusses its broader applications and limitations.

Algorithm · GNE · HTML parsing
0 likes · 8 min read
Ctrip Technology
Oct 11, 2019 · Artificial Intelligence

Intelligent Content Extraction and Generation Practices on Ctrip's Marco Polo AI Platform

This article details Ctrip's AI‑driven Marco Polo platform and describes how its large‑scale NLP pipelines combine extraction, richness evaluation, semantic matching, and deep‑learning generation (CopyNet, TA‑seq2seq) to produce high‑quality recommendation reasons across multiple product scenarios.

NLP · Spark · content-extraction
0 likes · 16 min read