To pass arguments with the crawl command:

    scrapy crawl myspider -a category='mycategory' -a domain='example.com'
To pass arguments when scheduling a run on scrapyd, replace -a with -d:

    curl http://your.ip.address.here:port/schedule.json -d spider=myspider -d category='mycategory' -d domain='example.com'
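The -d flags send ordinary form fields, so the same request body can be built from Python. The following is a minimal sketch using only the standard library. `build_schedule_payload` is a hypothetical helper name, and the optional `project` field is an assumption based on scrapyd's documented schedule.json parameters, since the curl command does not include it.

```python
from urllib.parse import urlencode


def build_schedule_payload(spider, project=None, **spider_args):
    """Return the form-encoded body for a schedule.json POST.

    Hypothetical helper: every extra keyword argument becomes one
    form field, mirroring one -d flag in the curl command.
    """
    fields = {}
    if project is not None:
        # Assumption: scrapyd's schedule.json also takes a project name.
        fields["project"] = project
    fields["spider"] = spider
    fields.update(spider_args)
    return urlencode(fields)


payload = build_schedule_payload(
    "myspider", category="mycategory", domain="example.com"
)
print(payload)
```

The resulting string could then be sent as the body of a POST request to the schedule.json URL, for example with `urllib.request.urlopen(url, data=payload.encode())`.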
The spider will receive arguments in its constructor.
    from scrapy import Spider

    class MySpider(Spider):
        name = "myspider"

        def __init__(self, category='', domain='', *args, **kwargs):
            super(MySpider, self).__init__(*args, **kwargs)
            # Store the command-line arguments on the spider instance.
            self.category = category
            self.domain = domain
Scrapy also sets every argument as an attribute of the spider, so you can skip the __init__ method entirely. Just be sure to read these attributes with getattr and a default value, so that your code doesn't break when an argument is not supplied.
    from scrapy import Spider

    class MySpider(Spider):
        name = "myspider"
        start_urls = ('https://httpbin.org/ip',)

        def parse(self, response):
            # getattr falls back to '' if the argument was not passed.
            print(getattr(self, 'category', ''))
            print(getattr(self, 'domain', ''))
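To see why the getattr default matters, here is a small sketch that runs without Scrapy installed. `ArgSpider` is a hypothetical stand-in that only imitates the behavior described above (keyword arguments copied onto the instance); it is not Scrapy's own class.

```python
class ArgSpider:
    """Stand-in for a spider; not Scrapy's own class."""

    def __init__(self, **kwargs):
        # Copy every keyword argument onto the instance as an attribute.
        self.__dict__.update(kwargs)


# Only category is supplied, as if the run used a single -a flag.
spider = ArgSpider(category="mycategory")

category = getattr(spider, "category", "")
domain = getattr(spider, "domain", "")  # missing, so the default '' is used
print(repr(category), repr(domain))
```

Accessing `spider.domain` directly here would raise an AttributeError, which is the breakage the getattr default avoids.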
Hassan Raza, Sep 08 '15 at 11:33