MaGe Linux Operations
Oct 27, 2020 · Backend Development

Build a Distributed Scrapy Crawler in Minutes with RabbitMQ and RedisBloom

This guide walks through installing Scrapy-Distributed, starting RabbitMQ and RedisBloom containers, writing a sitemap spider, configuring the distributed scheduler and dupefilter, and running the spider. It also explains why this non-intrusive approach improves on the existing Scrapy-Redis and scrapy-rabbitmq solutions.

Python · RabbitMQ · RedisBloom
7 min read