I am trying to scrape a table from an interactive ASPX web page. I have read all the R web-scraping questions on Stack Overflow, and I think I'm getting closer, but I can't quite get it to work.
I would like to extract data from the tables generated here. Ultimately I would like to loop over every date and state, but for now my task is just to get R to submit my parameters and pull out the summary table for any particular query.
From what I gather, the answer probably involves the RCurl and XML packages: posting the "form" with my parameters, and then reading the resulting page as HTML.
My latest attempt is as follows:
    library(RCurl)
    library(XML)

    curl = getCurlHandle()
    link = "http://indiawater.gov.in/IMISReports/Reports/WaterQuality/rpt_WQM_HabitationWiseLabTesting_S.aspx"
    html = getURL(link, curl = curl)

    params = list('ctl00$ContentPlaceHolder$ddFinYear' = '2005-2006',
                  'ctl00$ContentPlaceHolder$ddState' = 'BIHAR')
    html2 = postForm(link, .params = params, curl = curl)
    table = readHTMLTable(html2)
It's hard for me to say at what point I ran into a problem. On the one hand, html == html2 evaluates to FALSE, so I think html2 reflects some state after the form was submitted. But it is still not clear to me whether the form was submitted incorrectly, or whether the submission worked and it is the step of reading the result into a table that fails.
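One way to tell these two failure modes apart is to inspect the raw HTML directly, before involving readHTMLTable. The sketch below uses only base R. Both `hidden_fields` and `count_tables` are hypothetical helper names, not part of RCurl or XML. It rests on the assumption that this page, like many ASP.NET WebForms pages, emits hidden inputs such as `__VIEWSTATE` that the server expects to be posted back. Whether this particular page requires them has not been verified here.

```r
# Hypothetical helper: collect name/value pairs of hidden <input> tags
# from an HTML string. Assumes attributes are double-quoted.
hidden_fields <- function(html) {
  tags <- regmatches(
    html,
    gregexpr('<input[^>]*type="hidden"[^>]*>', html, ignore.case = TRUE)
  )[[1]]

  # Extract one attribute value from a single tag, or NA if absent.
  get_attr <- function(tag, attr) {
    m <- regmatches(tag, regexec(paste0(attr, '="([^"]*)"'), tag))[[1]]
    if (length(m) == 2) m[2] else NA_character_
  }

  values <- vapply(tags, get_attr, character(1), attr = "value",
                   USE.NAMES = FALSE)
  names(values) <- vapply(tags, get_attr, character(1), attr = "name",
                          USE.NAMES = FALSE)
  as.list(values)
}

# Hypothetical helper: count opening <table> tags in an HTML string.
count_tables <- function(html) {
  m <- gregexpr("<table[ >]", html, ignore.case = TRUE)[[1]]
  sum(m > 0)
}

# Small made-up page standing in for the real response.
sample_html <- paste0(
  '<form><input type="hidden" name="__VIEWSTATE" value="abc123" />',
  '<input type="text" name="q" value="x" />',
  '<table><tr><td>1</td></tr></table></form>'
)

print(names(hidden_fields(sample_html)))
print(count_tables(sample_html))
```

Applied to the real objects, `count_tables(html2)` returning 0 would suggest the submission itself did not produce a results page, while a non-zero count would point at the table-reading step. If `hidden_fields(html)` returns fields from the initial GET, one thing to try is sending them along with the query, for example `postForm(link, .params = c(hidden_fields(html), params), curl = curl)`.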
Any suggestion or help is appreciated. Thanks!
r web-scraping rcurl
Daedalus bloom