I occasionally scrape an e-commerce web page to get product pricing information. I had not used my Scrapy-based scraper in a while, and yesterday when I tried to run it I ran into a problem with bot protection.
The site uses CloudFlare's DDoS protection, which essentially relies on JavaScript evaluation to filter out browsers (and therefore scrapers) with JS disabled. Once the challenge function is evaluated, a response with a calculated number is produced. In return, the service sends back two authentication cookies which, attached to each subsequent request, normally let you crawl the site. Here is a description of how this works.
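To make the cookie part concrete, this is roughly what I picture a request with those clearance cookies looking like in Scrapy (just a guess on my part; the cookie names are based on what CloudFlare usually sets, and the URL and values are placeholders):

```python
import scrapy


class PriceSpider(scrapy.Spider):
    name = "prices"

    def start_requests(self):
        # Placeholder cookie values obtained after solving the JS challenge
        # (cf_clearance / __cfduid are the cookies CloudFlare usually issues).
        cookies = {
            "cf_clearance": "<value returned by the challenge>",
            "__cfduid": "<value returned by the challenge>",
        }
        # The User-Agent should match the one used when the cookies were issued.
        headers = {"User-Agent": "Mozilla/5.0 (compatible; my-scraper)"}
        yield scrapy.Request(
            "https://example.com/products",  # placeholder URL
            cookies=cookies,
            headers=headers,
            callback=self.parse,
        )

    def parse(self, response):
        # Normal parsing once the clearance cookies are accepted.
        self.log(str(response.status))
```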
I also found the cloudflare-scrape Python module, which uses an external JS evaluation engine to calculate the number and send the request back to the server. I am not sure how to integrate it into Scrapy, though. Or maybe there is a more reasonable way that does not require JS execution? In the end, it is just a form...
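Here is a rough sketch of what I imagine the integration could look like (untested, and not a confirmed recipe; the target URL is a placeholder): call cloudflare-scrape's get_tokens() once to solve the challenge, then hand the resulting cookies and User-Agent to the spider's requests.

```python
import cfscrape  # pip install cfscrape
import scrapy


class CloudflareSpider(scrapy.Spider):
    name = "cf_prices"
    start_url = "https://example.com/products"  # placeholder target

    def start_requests(self):
        # get_tokens() solves the JS challenge via an external JS engine and
        # returns the clearance cookies plus the User-Agent they were issued for.
        tokens, user_agent = cfscrape.get_tokens(self.start_url)
        yield scrapy.Request(
            self.start_url,
            cookies=tokens,
            headers={"User-Agent": user_agent},
            callback=self.parse,
        )

    def parse(self, response):
        # Scrapy's cookie middleware keeps the cookies for follow-up requests.
        yield {"status": response.status}
```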
I would appreciate any help.
javascript python cookies scrapy