Tag

crawler frameworks

1 view collected around this technical thread.

Qunar Tech Salon
Nov 30, 2015 · Backend Development

Choosing a Web Crawler: Nutch, Crawler4j, WebMagic, WebCollector, Scrapy, or Others

This article compares distributed, Java-based, and non-Java web crawlers, covering Nutch, Crawler4j, WebMagic, WebCollector, Scrapy, and alternatives. It highlights their strengths, limitations, and suitability for tasks such as data extraction, multi-threading, AJAX handling, and search-engine construction.

Java · Nutch · Python
0 likes · 11 min read