@eLRuLL's answer is wonderful. I want to add a part about how the item is passed along. First, to be clear: a callback function runs only once the response to its request has been downloaded.
The code in the Scrapy documentation does not declare the URL or the request for page 1. Let the page 1 URL be "http://www.example.com.html".
parse_page1 is the callback of the page 1 request:

    scrapy.Request("http://www.example.com.html", callback=parse_page1)

parse_page2 is the callback of the page 2 request:

    scrapy.Request("http://www.example.com/some_page.html", callback=parse_page2)
When the response of page 1 has been downloaded, parse_page1 is called to generate the page 2 request:

    item['main_url'] = response.url  # store "http://www.example.com.html" in the item
    request = scrapy.Request("http://www.example.com/some_page.html",
                             callback=self.parse_page2)
    request.meta['item'] = item  # store the item in request.meta
After the response of page 2 has been downloaded, parse_page2 is called to retrieve the item:

    item = response.meta['item']  # response.meta is the same as request.meta, so item['main_url'] == "http://www.example.com.html"
    item['other_url'] = response.url  # response.url == "http://www.example.com/some_page.html"
    return item  # finally, we get the item recording the URLs of page 1 and page 2
xingpei pang