Send request using python to asp.net page

I need to scrape PIN codes from http://www.indiapost.gov.in/pin/ and am using the following code.

    import urllib
    import urllib2

    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Origin': 'http://www.indiapost.gov.in',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Referer': 'http://www.indiapost.gov.in/pin/',
        'Accept-Encoding': 'gzip,deflate,sdch',
        'Accept-Language': 'en-US,en;q=0.8',
        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
    }

    viewstate = 'JulXDv576ZUXoVOwThQQj4bDuseXWDCZMP0tt+HYkdHOVPbx++G8yMISvTybsnQlNN76EX/...'
    eventvalidation = '8xJw9GG8LMh6A/b6/jOWr970cQCHEj95/6ezvXAqkQ/C1At06MdFIy7+iyzh7813e1/3Elx...'
    url = 'http://www.indiapost.gov.in/pin/'

    formData = (
        ('__EVENTVALIDATION', eventvalidation),
        ('__EVENTTARGET', ''),
        ('__EVENTARGUMENT', ''),
        ('__VIEWSTATE', viewstate),
        ('__VIEWSTATEENCRYPTED', ''),
        ('__EVENTVALIDATION', eventvalidation),
        ('txt_offname', ''),
        ('ddl_dist', '0'),
        ('txt_dist_on', ''),
        ('ddl_state', '2'),
        ('btn_state', 'Search'),
        ('txt_stateon', ''),
        ('hdn_tabchoice', '3')
    )

    from urllib import FancyURLopener

    class MyOpener(FancyURLopener):
        version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17'

    myopener = MyOpener()
    encodedFields = urllib.urlencode(formData)
    f = myopener.open(url, encodedFields)
    print f.info()

    try:
        fout = open('tmp.txt', 'w')
    except:
        print('Could not open output file\n')
    fout.writelines(f.readlines())
    fout.close()

The server responds with "Sorry that this site has a serious problem, try reloading the page or contacting the webmaster." Please tell me where I am going wrong.

python web-scraping
1 answer

Where did you get the values of viewstate and eventvalidation? For one thing, they should not end with "..."; you must have omitted part of them. For another, they should not be hardcoded at all.

One solution:

  • Get the page at the URL http://www.indiapost.gov.in/pin/ without any form data.
  • Parse the response and retrieve form values such as __VIEWSTATE and __EVENTVALIDATION (you can use BeautifulSoup).
  • Get the search result (a second HTTP request), including the form values retrieved in step 2 in the form data.
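As a rough illustration of step 2, here is a minimal sketch that collects every hidden input from a page using only the Python 3 standard library instead of BeautifulSoup. The names HiddenFieldParser and extract_hidden_fields are hypothetical, and SAMPLE_HTML with its values is made up for the example; the real page's markup may differ.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # A missing value attribute is treated as an empty string.
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of hidden form fields found in the given HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up stand-in for the page returned by the first (GET) request.
SAMPLE_HTML = """
<form method="post" action="./">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123+/=" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="txt_offname" value="" />
</form>
"""

fields = extract_hidden_fields(SAMPLE_HTML)
print(sorted(fields))
```

Collecting all hidden inputs rather than naming two of them means any extra state fields the form carries are picked up without further changes.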

UPDATE

Following the idea above, I modified your code a bit to make it work:

    import urllib
    from bs4 import BeautifulSoup

    # Browser-like headers (defined here but not passed to the opener below)
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Origin': 'http://www.indiapost.gov.in',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Referer': 'http://www.indiapost.gov.in/pin/',
        'Accept-Encoding': 'gzip,deflate,sdch',
        'Accept-Language': 'en-US,en;q=0.8',
        'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'
    }

    class MyOpener(urllib.FancyURLopener):
        version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17'

    myopener = MyOpener()
    url = 'http://www.indiapost.gov.in/pin/'

    # first HTTP request without form data
    f = myopener.open(url)
    soup = BeautifulSoup(f)

    # parse and retrieve two vital form values
    viewstate = soup.select("#__VIEWSTATE")[0]['value']
    eventvalidation = soup.select("#__EVENTVALIDATION")[0]['value']

    formData = (
        ('__EVENTVALIDATION', eventvalidation),
        ('__VIEWSTATE', viewstate),
        ('__VIEWSTATEENCRYPTED', ''),
        ('txt_offname', ''),
        ('ddl_dist', '0'),
        ('txt_dist_on', ''),
        ('ddl_state', '1'),
        ('btn_state', 'Search'),
        ('txt_stateon', ''),
        ('hdn_tabchoice', '1'),
        ('search_on', 'Search'),
    )
    encodedFields = urllib.urlencode(formData)

    # second HTTP request with form data
    f = myopener.open(url, encodedFields)

    # actually we'd better use BeautifulSoup once again to
    # retrieve the results (instead of writing out the whole HTML file).
    # Besides, since the result is split across multiple pages,
    # we need to send more HTTP requests
    try:
        fout = open('tmp.html', 'w')
    except IOError:
        print('Could not open output file\n')
    else:
        # only write when the file was actually opened
        fout.writelines(f.readlines())
        fout.close()
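The code in this thread is Python 2. For readers on Python 3, where urlencode lives in urllib.parse and FancyURLopener is deprecated, the second request might be built roughly as follows. This is only a sketch: build_search_request is a hypothetical helper, the hidden-field values are placeholders rather than real tokens, and the request is constructed but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 "
    "(KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17"
)


def build_search_request(url, hidden_fields, search_fields):
    """Build a POST request that combines hidden state with search fields.

    hidden_fields: dict of values parsed from the first response.
    search_fields: sequence of (name, value) pairs for the search form.
    """
    form_data = list(hidden_fields.items()) + list(search_fields)
    body = urlencode(form_data).encode("ascii")
    # Passing data makes urllib treat the request as a POST.
    return Request(
        url,
        data=body,
        headers={
            "User-Agent": USER_AGENT,
            "Content-Type": "application/x-www-form-urlencoded",
            "Referer": url,
        },
    )


# Placeholder values; real ones must come from the freshly fetched page.
request = build_search_request(
    "http://www.indiapost.gov.in/pin/",
    {
        "__VIEWSTATE": "placeholder-viewstate",
        "__EVENTVALIDATION": "placeholder-validation",
    },
    [("ddl_state", "1"), ("btn_state", "Search"), ("hdn_tabchoice", "1")],
)
print(request.get_method())
print(request.data.decode("ascii"))
```

Sending it would then be a matter of passing the request to urllib.request.urlopen and reading the response, ideally inside a try/except for urllib.error.URLError.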