I use Scrapy to submit the form at https://www.barefootstudent.com/jobs (on any job page linked from there, e.g. http://www.barefootstudent.com/los_angeles/jobs/full_time/full_time_nanny_needed_in_venice_217021 ).
My Scrapy bot registered successfully, but I cannot get past the captcha. To submit the form I use scrapy.FormRequest.from_response:
frq = scrapy.FormRequest.from_response(response, formdata={
    'message': 'itttttttt', 'security': captcha, 'name': 'fx',
    'category_id': '2', 'item_id': '216640_2', 'location': '18',
    # Scrapy URL-encodes formdata itself, so the values are given unencoded
    'email': 'ololo@gmail.com', 'send_message': 'Send Message',
}, callback=self.afterForm)
yield frq
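A side note on the formdata values: as far as I understand, FormRequest URL-encodes formdata when it builds the request body, so a value that is already percent-encoded gets encoded a second time. Below is a small stdlib-only sketch of that effect. It uses urlencode as a stand-in for what Scrapy does internally, and the helper name `encode_form` is my own, not part of Scrapy:

```python
try:
    from urllib import urlencode  # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3


def encode_form(fields):
    """URL-encode a dict of form fields, sorted by key for stable output."""
    return urlencode(sorted(fields.items()))


# Plain values are encoded exactly once.
plain = encode_form({'email': 'ololo@gmail.com',
                     'send_message': 'Send Message'})
# plain == 'email=ololo%40gmail.com&send_message=Send+Message'

# Pre-encoded values are encoded again: '%' itself becomes '%25'.
pre_encoded = encode_form({'email': 'ololo%40gmail.com',
                           'send_message': 'Send%20Message'})
# pre_encoded == 'email=ololo%2540gmail.com&send_message=Send%2520Message'
```

So the server would see the literal text `ololo%40gmail.com` instead of an e-mail address if the value is passed pre-encoded.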
I want to download the captcha image from this page and type it in manually while the script is running, e.g.:
captcha = raw_input("put captcha in manually>")
I tried to download the image with:
urllib.urlretrieve(captcha, "./captcha.jpg")
But this downloads the wrong captcha (the site rejects my input). I called urllib.urlretrieve several times in one run of the script, and each time it returned a different captcha :(
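My guess (an assumption, I have not confirmed it on this site) is that the captcha is tied to the session cookie. urllib.urlretrieve does not send the cookies that Scrapy holds, so the server would generate a new captcha for every call. Below is a minimal sketch of fetching the image with the session cookie attached. `build_captcha_request` and `fetch_captcha` are hypothetical helper names, and the cookie string would have to be copied from the Scrapy session somehow:

```python
try:
    from urllib2 import Request, urlopen  # Python 2
except ImportError:
    from urllib.request import Request, urlopen  # Python 3


def build_captcha_request(captcha_url, cookie_header):
    """Return a request for the captcha image that carries the session cookie."""
    request = Request(captcha_url)
    if cookie_header:
        # Send the same cookie as the form page, so the server serves
        # the captcha that belongs to this session (assumed behaviour).
        request.add_header('Cookie', cookie_header)
    return request


def fetch_captcha(request, path='./captcha.jpg'):
    """Download the image described by `request` and save it to `path`."""
    data = urlopen(request).read()
    with open(path, 'wb') as image_file:
        image_file.write(data)
    return path
```

It would be called as `fetch_captcha(build_captcha_request(captcha, 'sid=abc123'))`, where `sid=abc123` is a made-up cookie value. Requesting the captcha URL with a scrapy.Request and saving response.body in its callback should have the same effect, since Scrapy attaches the session cookies itself.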
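Then I tried ImagesPipeline.
But the item is only returned (and the image downloaded) after the function has finished, even though I use yield.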
item = BfsItem()
item['image_urls'] = [captcha]
yield item
captcha = raw_input("put captcha in manually>")
frq = scrapy.FormRequest.from_response(response, formdata={
    'message': 'itttttttt', 'security': captcha, 'name': 'fx',
    'category_id': '2', 'item_id': '216640_2', 'location': '18',
    # Scrapy URL-encodes formdata itself, so the values are given unencoded
    'email': 'ololo@gmail.com', 'send_message': 'Send Message',
}, callback=self.afterForm)
yield frq
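So at the moment the script asks me for the captcha, the image has not been downloaded yet!
How can I change the script so that FormRequest is sent only after I have typed in the captcha manually?
Thanks!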