Scrapy Shell - How to change USER_AGENT

Question

I have a fully functioning Scrapy script that retrieves data from a website. During setup, the target site banned me based on my USER_AGENT. I then added RotateUserAgentMiddleware to rotate the USER_AGENT randomly. This works great.

However, when I now use the Scrapy shell to test XPath and CSS selectors, I get a 403 error. I am sure this is because the shell's default USER_AGENT matches a value that the target site has blacklisted.

Question: Is it possible to fetch a URL in the Scrapy shell with a USER_AGENT other than the default?

fetch('http://www.test') [add something ?? to change USER_AGENT]

thanks

+8

python shell scrapy agent

dfriestedt Aug 21 '14 at 15:00

2 answers

Inside the Scrapy shell, you can set the User-Agent in the request headers:

    url = 'http://www.example.com'
    request = scrapy.Request(url, headers={'User-Agent': 'Mybot'})
    fetch(request)
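Since the question mentions rotating user agents in the spider, the same idea can be carried into the shell with a small helper that picks a header at random. This is only a sketch: `USER_AGENTS` and `random_headers` are hypothetical names, and the strings in the list are placeholders to replace with your own.

```python
import random

# Placeholder User-Agent strings for illustration only; substitute your own.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def random_headers(user_agents=USER_AGENTS):
    """Return a headers dict with one User-Agent picked at random."""
    return {"User-Agent": random.choice(user_agents)}


# Inside the scrapy shell, where `scrapy` and `fetch` are already available:
#   request = scrapy.Request('http://www.example.com', headers=random_headers())
#   fetch(request)
```

The helper only builds the dict; the request itself is still created and fetched exactly as in the answer above.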

+2

salman wahed Oct 19 '16 at 15:57

marven · Accepted Answer · 2014-08-22T01:15:01+0000

scrapy shell -s USER_AGENT='custom user agent' 'http://www.example.com'
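When the shell is started from inside a Scrapy project directory, it reads that project's settings, so the same setting can also be defined once in settings.py instead of being passed with -s each time. A minimal sketch, assuming a standard project layout; the value shown is a placeholder, and a custom rotation middleware in the project may still override it.

```python
# settings.py of the Scrapy project (placeholder value, substitute your own)
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
```

With this in place, running scrapy shell 'http://www.example.com' from the project directory should send this value unless a middleware replaces it.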
