Scrapy Shell - How to change USER_AGENT

I have a fully working Scrapy spider that retrieves data from a website. While I was setting it up, the target site banned me based on my USER_AGENT. I then added a RotateUserAgentMiddleware to rotate the USER_AGENT randomly, and that works great.

However, when I now use the scrapy shell to test my XPath and CSS selectors, I get a 403 error. I am sure this is because the default USER_AGENT sent by the scrapy shell matches a value that the target site has blacklisted.

Question: Is it possible to fetch a URL in the scrapy shell with a USER_AGENT other than the default?

fetch('http://www.test') [add something here to change USER_AGENT?]

thanks

+8
python shell scrapy agent
2 answers

scrapy shell -s USER_AGENT='custom user agent' 'http://www.example.com'
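The -s option overrides a single setting for that one shell session. If the shell is started from inside the project directory, it should also pick up the project's settings, so the same value could be set once there instead. A minimal sketch of that settings.py fragment follows; the user agent string is a made-up placeholder, not a value from the question:

```python
# settings.py (fragment) -- assumes the shell is launched from inside the
# Scrapy project, so that the project settings are loaded.

# Hypothetical placeholder value; substitute whatever string the target
# site accepts.
USER_AGENT = "Mozilla/5.0 (compatible; MyCrawler/1.0; +http://www.example.com/bot)"
```

A value passed with -s on the command line takes precedence over the one in settings.py, so the one-off override above keeps working either way.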

+21

Inside the scrapy shell, you can set the User-Agent in the request headers:

url = 'http://www.example.com'
request = scrapy.Request(url, headers={'User-Agent': 'Mybot'})
fetch(request)
+2
