Form data posted with HttpWebRequest does not affect the returned results

I came across a website that seems simple enough, and I was sure I could read its data using HttpWebRequest with GET and POST requests. GET requests work fine. The POST request does not generate any errors either, but the posted form data has no effect on the returned results. The form has fields for filtering the data by date, but no matter which values I put in them, the returned data is not filtered. I added every header and form field, and also added the request cookies.

Webpage URL: http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0

This looks like a perfectly ordinary website, but because it is an aspx page that uses ViewState and EventValidation, it was not that simple.

My first step was to analyze the site's GET and POST requests using Fiddler, and to my surprise Fiddler does not capture traffic for this URL. I tried Charles, but it does not capture this URL either. For every URL other than this one, both Fiddler and Charles capture traffic normally. I should also mention that when I requested the URL from a console application using HttpWebRequest, both Fiddler and Charles captured it, but they did not capture it from Chrome, Firefox, or Internet Explorer 11.

So I analyzed the network activity using the developer tools in Firefox, and everything was visible there (headers, parameters, and cookies). Chrome showed no cookies. When I checked for cookies by creating an HttpWebRequest and reading the response, no cookie was present either. So something really strange is going on with this site.

I somehow managed to write a simple function that creates a request and receives a response. I first make a GET request, read the page as a string, and extract ViewState, EventValidation, etc. from it. I then use this information in the second HttpWebRequest, which is the POST. Everything runs without errors and I get a response, but not the expected one. I want the records that fall between two dates, and I have specified these dates in the form data, but the POST request still does not return filtered data. The function I wrote is below, and I would really appreciate any suggestions on why this happens and how to handle it. This has been a challenge for me, because I can't understand why this simple site does not show up in Fiddler. (The page uses a JavaScript postback.)

The code may look long and scary, but it is very simple and straightforward.

    Try
        ' First GET Request to obtain Viewstate, Eventvalidation etc
        Dim objRequest2 As Net.HttpWebRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)
        objRequest2.Method = "GET"
        objRequest2.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
        objRequest2.Headers.Add("Accept-Encoding", "gzip, deflate")
        objRequest2.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
        objRequest2.KeepAlive = True
        objRequest2.ContentType = "application/x-www-form-urlencoded"
        objRequest2.Host = "www.bseindia.com"
        objRequest2.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
        objRequest2.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip

        Dim LoginRes2 As Net.HttpWebResponse
        Dim sr2 As IO.StreamReader
        LoginRes2 = objRequest2.GetResponse()
        sr2 = New IO.StreamReader(LoginRes2.GetResponseStream)
        Dim getString As String = sr2.ReadToEnd()
        Dim getCookieCollection = objRequest2.CookieContainer

        ' get the page ViewState
        Dim viewStateFlag As String = "id=""__VIEWSTATE"" value="""
        Dim i As Integer = getString.IndexOf(viewStateFlag) + viewStateFlag.Length
        Dim j As Integer = getString.IndexOf("""", i)
        Dim viewState As String = getString.Substring(i, j - i)

        ' get page EventValidation
        Dim eventValidationFlag As String = "id=""__EVENTVALIDATION"" value="""
        i = getString.IndexOf(eventValidationFlag) + eventValidationFlag.Length
        j = getString.IndexOf("""", i)
        Dim eventValidation As String = getString.Substring(i, j - i)

        ' get page ViewStateGenerator
        Dim viewstateGeneratorFlag As String = "id=""__VIEWSTATEGENERATOR"" value="""
        i = getString.IndexOf(viewstateGeneratorFlag) + viewstateGeneratorFlag.Length
        j = getString.IndexOf("""", i)
        Dim viewStateGenerator As String = getString.Substring(i, j - i)

        viewState = System.Web.HttpUtility.UrlEncode(viewState)
        eventValidation = System.Web.HttpUtility.UrlEncode(eventValidation)

        Dim LoginRes As Net.HttpWebResponse
        Dim sr As IO.StreamReader
        Dim objRequest As Net.HttpWebRequest

        ' Second POST request to post the form data along with cookies
        objRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)

        Dim formDataCollection As New NameValueCollection
        formDataCollection.Add("__EVENTTARGET", "")
        formDataCollection.Add("__EVENTARGUMENT", "")
        formDataCollection.Add("__VIEWSTATE", viewState)
        formDataCollection.Add("__VIEWSTATEGENERATOR", viewStateGenerator)
        formDataCollection.Add("__EVENTVALIDATION", eventValidation)
        formDataCollection.Add("fmdate", "20160104")
        formDataCollection.Add("eddate", "20160204")
        formDataCollection.Add("hidCurrentDate", "2016/02/04")
        formDataCollection.Add("ctl00_ContentPlaceHolder1_hdnCode", "")
        formDataCollection.Add("txtDate", "04/01/2016")
        formDataCollection.Add("ddlCalMonthDiv3", "1")
        formDataCollection.Add("ddlCalYearDiv3", "2016")
        formDataCollection.Add("txtTodate", "04/02/2016")
        formDataCollection.Add("ddlCalMonthDiv4", "2")
        formDataCollection.Add("ddlCalYearDiv4", "2016")
        formDataCollection.Add("Hidden1", "")
        formDataCollection.Add("ctl00_ContentPlaceHolder1_GetQuote1_smartSearch", "Enter Security Name / Code / ID")
        formDataCollection.Add("btnSubmit.x", "44")
        formDataCollection.Add("btnSubmit.y", "2")

        Dim strFormdata As String = formDataCollection.ToString()
        Dim encoding As New ASCIIEncoding
        Dim postBytes As Byte() = encoding.GetBytes(strFormdata)

        objRequest.Method = "POST"
        objRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
        objRequest.Headers.Add("Accept-Encoding", "gzip, deflate")
        objRequest.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
        objRequest.Headers.Add("Cache-Control", "private, max-age=60")
        objRequest.KeepAlive = True
        objRequest.ContentType = "application/x-www-form-urlencoded"
        objRequest.Host = "www.bseindia.com"
        objRequest.Headers.Add("Origin", "http://www.bseindia.com")
        objRequest.Referer = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"
        objRequest.Headers.Add("Upgrade-Insecure-Requests", "1")
        objRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
        objRequest.ContentType = "text/html; charset=utf-8"
        objRequest.Date = "Thu, 04 Feb 2016 13:42:04 GMT"
        objRequest.Headers.Add("Server", "Microsoft-IIS/8.0")
        objRequest.Headers.Add("Vary", "Accept-Encoding")
        objRequest.Headers.Add("X-AspNet-Version", "2.0.50727")
        objRequest.Headers.Add("ASP.NET", "ASP.NET")
        objRequest.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip

        Dim gaCookies As New CookieContainer()

        Dim cookie1 As New Cookie("__asc", "f673f0d5152a823bc335f575d34")
        cookie1.Domain = ".bseindia.com"
        cookie1.Path = "/"
        gaCookies.Add(cookie1)

        Dim cookie2 As New Cookie("__auc", "f673f0d5152a823bc335f575d34")
        cookie2.Domain = ".bseindia.com"
        cookie2.Path = "/"
        gaCookies.Add(cookie2)

        Dim cookie3 As New Cookie("__utma", "253454874.280640365.1454519857.1454519865.1454519865.1")
        cookie3.Domain = ".bseindia.com"
        cookie3.Path = "/"
        gaCookies.Add(cookie3)

        Dim cookie4 As New Cookie("__utmb", "253454874.1.10.1454519865")
        cookie4.Domain = ".bseindia.com"
        cookie4.Path = "/"
        gaCookies.Add(cookie4)

        Dim cookie5 As New Cookie("__utmc", "253454874")
        cookie5.Domain = ".bseindia.com"
        cookie5.Path = "/"
        gaCookies.Add(cookie5)

        Dim cookie6 As New Cookie("__utmt", "1")
        cookie6.Domain = ".bseindia.com"
        cookie6.Path = "/"
        gaCookies.Add(cookie6)

        Dim cookie7 As New Cookie("__utmz", "253454874.1454519865.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)")
        cookie7.Domain = ".bseindia.com"
        cookie7.Path = "/"
        gaCookies.Add(cookie7)

        Dim cookie8 As New Cookie("_ga", "GA1.2.280640365.1454519857")
        cookie8.Domain = ".bseindia.com"
        cookie8.Path = "/"
        gaCookies.Add(cookie8)

        Dim cookie9 As New Cookie("_gat", "1")
        cookie9.Domain = ".bseindia.com"
        cookie9.Path = "/"
        gaCookies.Add(cookie9)

        Dim postStream As Stream = objRequest.GetRequestStream()
        postStream.Write(postBytes, 0, postBytes.Length)
        postStream.Flush()
        postStream.Close()

        LoginRes = objRequest.GetResponse()
        sr = New IO.StreamReader(LoginRes.GetResponseStream)
        ReadWebsite = sr.ReadToEnd()
        sr.Close()
        sr = Nothing
        LoginRes.Close()
        LoginRes = Nothing
        objRequest = Nothing
        Exit Function
    Catch ex As Exception
        ReadWebsite = Nothing
    End Try

Note: the original form data as sent by the browser (ViewState and EventValidation omitted):

    fmdate: 20160130
    eddate: 20160205
    hidCurrentDate: 2016/02/05
    ctl00_ContentPlaceHolder1_hdnCode:
    txtDate: 04/01/2016
    ddlCalMonthDiv3: 1
    ddlCalYearDiv3: 2016
    txtTodate: 04/02/2016
    ddlCalMonthDiv4: 2
    ddlCalYearDiv4: 2016
    Hidden1:
    ctl00_ContentPlaceHolder1_GetQuote1_smartSearch: Enter Security Name / Code / ID
    btnSubmit.x: 55
    btnSubmit.y: 13

+7
web-scraping
2 answers

You could load the site in a browser and use a tool to drive the browser, instead of sending GET/POST requests directly. It may be simpler and somewhat more reliable than your current approach.

For example, Selenium WebDriver: http://www.seleniumhq.org/projects/webdriver/

You load the page, set the values of the form fields (using CSS selectors to find the appropriate fields), and then click the button. You can automate all of this and get the page source (unfortunately, I don't think you can get the full HTML in its current state after JavaScript has run, but you can potentially use the API to get the elements you need).

API documentation: http://seleniumhq.imtqy.com/selenium/docs/api/dotnet/
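
As a rough illustration of that approach, here is a minimal sketch using the Selenium .NET bindings from VB.NET. It assumes the Selenium.WebDriver package and a matching ChromeDriver are installed, and that the input names (txtDate, txtTodate, btnSubmit) match the form data listed in the question; those selectors are guesses, not verified against the live page. If the date boxes are read-only or filled in by a calendar widget, SendKeys may not work and the values would have to be set some other way.

```vbnet
Imports System
Imports OpenQA.Selenium
Imports OpenQA.Selenium.Chrome

Module SeleniumSketch

    Sub Main()
        ' ChromeDriver starts a real browser, so the page's own JavaScript
        ' takes care of ViewState, EventValidation and any hidden fields.
        Using driver As IWebDriver = New ChromeDriver()
            driver.Navigate().GoToUrl("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0")

            ' Assumed selectors: the names come from the form data in the question.
            Dim fromDate As IWebElement = driver.FindElement(By.CssSelector("input[name='txtDate']"))
            fromDate.Clear()
            fromDate.SendKeys("04/01/2016")

            Dim toDate As IWebElement = driver.FindElement(By.CssSelector("input[name='txtTodate']"))
            toDate.Clear()
            toDate.SendKeys("04/02/2016")

            ' Submitting through the button lets the browser build the POST itself.
            driver.FindElement(By.CssSelector("input[name='btnSubmit']")).Click()

            ' Either take the page source, or read individual elements through the API.
            Dim html As String = driver.PageSource
            Console.WriteLine("Page source length: " & html.Length)

            For Each row As IWebElement In driver.FindElements(By.CssSelector("table tr"))
                Console.WriteLine(row.Text)
            Next
        End Using
    End Sub

End Module
```

The "table tr" selector is only a placeholder for wherever the results actually appear on the page.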

+2

You really need to include ALL the fields from the form, including the hidden ones, and the ASP session identifier stored in the cookies. That way you fully emulate the browser request and get the result you want. An example of what needs to be sent: http://pastebin.com/AsSABgU6
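
A minimal sketch of what that could look like in VB.NET, assuming .NET Framework 4.5 or later. The helper names (ExtractHiddenFields, SetField, BuildFormBody, Send) are made up for this example and are not taken from the pastebin. Three details of the code in the question are worth checking against it: NameValueCollection.ToString() returns the type name rather than an encoded body, ContentType is overwritten with "text/html; charset=utf-8" before the POST is sent, and neither request is ever given a CookieContainer, so no cookies are stored or sent back. Whether fixing these is enough for this particular site is untested.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports System.Net
Imports System.Text
Imports System.Text.RegularExpressions

Module PostbackSketch

    ' Collects name/value pairs from every <input type="hidden"> tag.
    ' Only double-quoted attributes are handled; an HTML parser would be more robust.
    Function ExtractHiddenFields(html As String) As List(Of KeyValuePair(Of String, String))
        Dim result As New List(Of KeyValuePair(Of String, String))
        For Each tag As Match In Regex.Matches(html, "<input\b[^>]*>", RegexOptions.IgnoreCase)
            If Not Regex.IsMatch(tag.Value, "type\s*=\s*""hidden""", RegexOptions.IgnoreCase) Then Continue For
            Dim name As Match = Regex.Match(tag.Value, "\bname\s*=\s*""([^""]*)""", RegexOptions.IgnoreCase)
            If Not name.Success Then Continue For
            Dim value As Match = Regex.Match(tag.Value, "\bvalue\s*=\s*""([^""]*)""", RegexOptions.IgnoreCase)
            ' A missing value attribute yields an empty string here.
            result.Add(New KeyValuePair(Of String, String)(
                WebUtility.HtmlDecode(name.Groups(1).Value),
                WebUtility.HtmlDecode(value.Groups(1).Value)))
        Next
        Return result
    End Function

    ' Replaces any existing entry with the same name, then appends the new value.
    Sub SetField(fields As List(Of KeyValuePair(Of String, String)), name As String, value As String)
        fields.RemoveAll(Function(f) f.Key = name)
        fields.Add(New KeyValuePair(Of String, String)(name, value))
    End Sub

    ' URL-encodes each name and value exactly once and joins them as name=value&name=value.
    Function BuildFormBody(fields As IEnumerable(Of KeyValuePair(Of String, String))) As String
        Return String.Join("&", fields.Select(
            Function(f) WebUtility.UrlEncode(f.Key) & "=" & WebUtility.UrlEncode(f.Value)))
    End Function

    ' Sends a GET (no body) or a form POST, always through the same cookie container.
    Function Send(url As String, cookies As CookieContainer,
                  Optional formBody As String = Nothing) As String
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.CookieContainer = cookies
        request.AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate
        If formBody IsNot Nothing Then
            Dim bytes As Byte() = Encoding.UTF8.GetBytes(formBody)
            request.Method = "POST"
            request.ContentType = "application/x-www-form-urlencoded"
            request.ContentLength = bytes.Length
            Using stream As Stream = request.GetRequestStream()
                stream.Write(bytes, 0, bytes.Length)
            End Using
        End If
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Dim url As String = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"
        Dim cookies As New CookieContainer()

        ' GET first: the shared container keeps whatever cookies the server sets.
        Dim fields As List(Of KeyValuePair(Of String, String)) = ExtractHiddenFields(Send(url, cookies))

        ' Values below are copied from the form data in the question.
        SetField(fields, "fmdate", "20160104")
        SetField(fields, "eddate", "20160204")
        SetField(fields, "txtDate", "04/01/2016")
        SetField(fields, "txtTodate", "04/02/2016")
        SetField(fields, "btnSubmit.x", "44")
        SetField(fields, "btnSubmit.y", "2")

        Dim html As String = Send(url, cookies, BuildFormBody(fields))
        Console.WriteLine("Response length: " & html.Length)
    End Sub

End Module
```

The sketch only sets a few of the visible fields; the remaining ones from the captured form data (the month and year dropdowns, the search box) can be added with SetField in the same way. Response headers such as Server, Vary, or X-AspNet-Version do not belong on the request and are left out.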

+1
