Passing an argument to a callback function

    def parse(self, response):
        for sel in response.xpath('//tbody/tr'):
            item = HeroItem()
            item['hclass'] = response.request.url.split("/")[8].split('-')[-1]
            item['server'] = response.request.url.split('/')[2].split('.')[0]
            item['hardcore'] = len(response.request.url.split("/")[8].split('-')) == 3
            item['seasonal'] = response.request.url.split("/")[6] == 'season'
            item['rank'] = sel.xpath('td[@class="cell-Rank"]/text()').extract()[0].strip()
            item['battle_tag'] = sel.xpath('td[@class="cell-BattleTag"]//a/text()').extract()[1].strip()
            item['grift'] = sel.xpath('td[@class="cell-RiftLevel"]/text()').extract()[0].strip()
            item['time'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
            item['date'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
            url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()
            yield Request(url, callback=self.parse_profile)

    def parse_profile(self, response):
        sel = Selector(response)
        item = HeroItem()
        item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
        return item

I scrape the whole table in the main parse method and take several fields from it. One of these fields is a URL, and I want to follow it to get a whole new group of fields. How can I pass my already created item object to the callback function so that the final item retains all fields?

As shown in the code above, I can save either the fields found at the URL (what the code does at the moment) or only those listed in the table (by simply writing yield item), but I can't yield a single object with all the fields together.

I tried this, but obviously it does not work.

    yield Request(url, callback=self.parse_profile(item))

    def parse_profile(self, response, item):
        sel = Selector(response)
        item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
        return item
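
For reference, the attempt above fails because self.parse_profile(item) is evaluated immediately, while the request is still being built, so callback receives whatever that call produces (or the call itself raises) instead of a function to call later. The difference can be sketched in plain Python. Everything here is an assumption made for illustration: run_callback is a hypothetical stand-in for the framework, and the dictionaries stand in for the item and the response.

```python
from functools import partial


def run_callback(callback, response):
    """Hypothetical stand-in for a framework that later calls callback(response)."""
    if not callable(callback):
        raise TypeError("callback must be callable, got %s" % type(callback).__name__)
    return callback(response)


def parse_profile(response, item):
    # Add a field taken from the (fake) response to the existing item.
    item["weapon"] = response["weapon"]
    return item


item = {"rank": "1"}
fake_response = {"weapon": "some-weapon"}

# Wrong: parse_profile(...) runs right now and its *result* (a dict) is handed over.
try:
    run_callback(parse_profile(fake_response, dict(item)), fake_response)
    immediate_call_failed = False
except TypeError:
    immediate_call_failed = True

# Right: hand over a callable that already carries the extra argument.
merged = run_callback(partial(parse_profile, item=item), fake_response)
```

This only shows the general Python behaviour; it makes no claim about which kinds of callables Scrapy accepts as callbacks in every configuration.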
python callback arguments scrapy
2 answers

This is what the meta keyword argument of Request is for.

    def parse(self, response):
        for sel in response.xpath('//tbody/tr'):
            item = HeroItem()
            # Item assignment here
            url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()
            yield Request(url, callback=self.parse_profile, meta={'hero_item': item})

    def parse_profile(self, response):
        item = response.meta.get('hero_item')
        item['weapon'] = response.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
        yield item

Also note: doing sel = Selector(response) is a waste of resources and differs from what you did earlier, so I changed it. The selector is automatically available on the response as response.selector, and response.xpath is a convenient shortcut to it.
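
The hand-off through meta can also be illustrated without Scrapy installed. The following is a toy model only: FakeRequest, FakeResponse, crawl, and all of the sample rows and pages are hypothetical stand-ins invented for this sketch, not Scrapy classes or real data. It shows the one idea that matters: the dictionary attached to the request is the same one the second callback reads from the response.

```python
class FakeRequest:
    """Hypothetical stand-in for a request: a URL, a callback, and a meta dict."""

    def __init__(self, url, callback, meta=None):
        self.url = url
        self.callback = callback
        self.meta = dict(meta or {})


class FakeResponse:
    """Hypothetical stand-in for a response that exposes its request's meta."""

    def __init__(self, request, body):
        self.request = request
        self.body = body
        self.meta = request.meta


def parse(rows):
    # First callback: build a partial item per row and attach it to the follow-up request.
    for row in rows:
        item = {"rank": row["rank"], "battle_tag": row["battle_tag"]}
        yield FakeRequest(row["profile_url"], callback=parse_profile, meta={"hero_item": item})


def parse_profile(response):
    # Second callback: pick the item back up and add the remaining field.
    item = response.meta.get("hero_item")
    item["weapon"] = response.body["weapon"]
    yield item


def crawl(rows, pages):
    """Tiny driver: look up each request's page in a dict and run its callback."""
    results = []
    for request in parse(rows):
        response = FakeResponse(request, pages[request.url])
        results.extend(request.callback(response))
    return results


# Made-up sample data for the sketch.
rows = [
    {"rank": "1", "battle_tag": "Alpha", "profile_url": "/profile/alpha"},
    {"rank": "2", "battle_tag": "Beta", "profile_url": "/profile/beta"},
]
pages = {"/profile/alpha": {"weapon": "axe"}, "/profile/beta": {"weapon": "bow"}}
items = crawl(rows, pages)
```

Because a fresh item is created for every row, each follow-up request carries its own object and the completed items do not overwrite one another.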


I had a similar problem passing additional arguments to callbacks in Tkinter and found this solution to work (from http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/extra-args.html), adapted here to your problem:

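    def parse(self, response):
        item = HeroItem()
        [...]

        # The callback is called with the new response as its first positional
        # argument, so only the extra values are bound as defaults.
        def handler(response, self=self, item=item):
            """ passing as default argument values """
            return self.parse_profile(response, item)

        yield Request(url, callback=handler)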
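
The reason for using default argument values rather than a plain closure shows up once the handler is created inside a loop: a default is evaluated when the function is defined, while a name captured from the enclosing scope is looked up only when the function is called. A minimal plain-Python sketch, with hypothetical helper names and string placeholders standing in for items and responses:

```python
def make_handlers_late(items):
    # Each lambda looks up `item` when it is *called*, so all of them see the last value.
    handlers = []
    for item in items:
        handlers.append(lambda response: (response, item))
    return handlers


def make_handlers_bound(items):
    # The default value is evaluated when the lambda is *defined*, once per iteration.
    handlers = []
    for item in items:
        handlers.append(lambda response, item=item: (response, item))
    return handlers


items = ["item-a", "item-b", "item-c"]
late = [handler("resp") for handler in make_handlers_late(items)]
bound = [handler("resp") for handler in make_handlers_bound(items)]
```

Here every entry of late carries "item-c", while bound pairs each call with the item that was current when its handler was created.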