I am having trouble retrieving data from the following website. If I open long_url in my browser, I can see the table that I want to scrape, but if I request the URL from R using httr, either the data is not returned to me or I do not understand how it is returned.
base_url <- "http://web1.ncaa.org/stats/exec/records"
long_url <- "http://web1.ncaa.org/stats/exec/records?academicYear=2014&sportCode=MFB&orgId=721"
library(XML)
library(httr)
library(rvest) # devtools::install_github("hadley/rvest")
The results of these two POST requests look identical to me:
doc <- POST(base_url,
            query = list(academicYear = "2014", sportCode = "MFB", orgId = "721"))
doc <- POST(long_url)
class(doc)
Both POST requests return a status code of 200, and the class of doc is "HTMLInternalDocument" and "XMLInternalDocument", which is the usual R object that lets me scrape pages. But the following rvest and XML calls return empty results, although I know there is a table at that URL.
table <- html_nodes(doc, css = "td")
table <- readHTMLTable(doc)
Am I using httr incorrectly? Should I be making a GET request instead?
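For reference, here is a minimal sketch of the GET variant I have in mind, assuming a current rvest release (where read_html() replaces the older html() function). The helper fetch_body() is a hypothetical name of my own, and I have not confirmed that the NCAA server returns the table for a plain GET. The parsing half runs on a made-up HTML string so that it can be checked without network access.

```r
library(rvest)

# Hypothetical helper: request the page with GET and return the body as text.
# An httr request returns a "response" object, so the body has to be
# extracted with content() before it can be parsed.
fetch_body <- function(url, query) {
  resp <- httr::GET(url, query = query)
  httr::stop_for_status(resp)  # raise an R error on 4xx/5xx responses
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Stand-in for the text fetch_body() would return; this markup is invented
# for illustration and is not the real NCAA page.
body_text <- "<html><body><table>
  <tr><th>Year</th><th>Wins</th></tr>
  <tr><td>2014</td><td>10</td></tr>
</table></body></html>"

page  <- read_html(body_text)          # parse the text into an HTML document
cells <- html_nodes(page, css = "td")  # select every table cell
tbl   <- html_table(page)[[1]]         # first table as a data frame
```

Against the real site, body_text would instead come from something like fetch_body(base_url, list(academicYear = "2014", sportCode = "MFB", orgId = "721")).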