Reading line by line from an HTTPS connection in R

When a connection is created with open="r", you can read from it incrementally, which is useful for batch processing of large data streams. For example, this script parses a sizable gzipped JSON HTTP stream by reading 100 lines at a time. Unfortunately, R does not support SSL:

> readLines(url("https://api.github.com/repos/jeroenooms/opencpu"))
Error in readLines(url("https://api.github.com/repos/jeroenooms/opencpu")) : 
  cannot open the connection: unsupported URL scheme

The RCurl and httr packages do support HTTPS, but I don't think they can create a connection object similar to url(). Is there another way to read incrementally from an HTTPS connection, similar to the example in the script above?
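To make the pattern concrete, here is a minimal sketch of batched reading from an open connection. It uses a textConnection over 250 in-memory lines as a stand-in for the remote stream (that stand-in and the names lines_in and batch_sizes are assumptions for illustration only); the loop is the same shape I would like to use over HTTPS.

```r
# Stand-in for a remote stream: 250 JSON lines held in memory.
# textConnection() is used here only so the sketch runs offline.
lines_in <- sprintf('{"id":%d}', 1:250)
con <- textConnection(lines_in, open = "r")

batch_sizes <- integer(0)
# readLines() returns character(0) once the input is exhausted,
# which has length 0 and therefore ends the loop.
while (length(records <- readLines(con, n = 100))) {
  batch_sizes <- c(batch_sizes, length(records))
}
close(con)

print(batch_sizes)  # 100 100 50
```

Each call to readLines() picks up where the previous one stopped, so only one batch needs to be held in memory at a time.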

2 answers

One option is to call the curl command-line tool through pipe():

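library(jsonlite)
# Let the curl executable handle HTTPS; gzcon() decompresses the piped stream.
gzstream <- gzcon(pipe("curl https://jeroenooms.imtqy.com/files/hourly_14.json.gz", open="r"))
batches <- list(); i <- 1
# Read 100 lines at a time until readLines() returns an empty vector.
while(length(records <- readLines(gzstream, n = 100))){
  message("Batch ", i, ": found ", length(records), " lines of json...")
  # Wrap the newline-delimited records into a single JSON array.
  json <- paste0("[", paste0(records, collapse=","), "]")
  batches[[i]] <- fromJSON(json, validate=TRUE)
  i <- i+1
}
# rbind_pages() is the current name of rbind.pages() in jsonlite.
weather <- rbind_pages(batches)
rm(batches); close(gzstream)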

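However, this relies on the curl executable being available on the system. A solution based on RCurl/libcurl would be preferable.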


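Yes, RCurl can read "in turns". You supply a function for the write (or headerfunction) option, and libcurl calls it each time it has received more of the response. See the RCurl documentation.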

library(RCurl)
# The callback is invoked once per chunk of the response body.
curlPerform(url = "http://www.omegahat.org/index.html",
            writefunction = function(txt, ...) {
              cat("*", txt, "\n")
              TRUE
            })
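Note that the chunks passed to the callback are cut wherever libcurl happens to receive data, not at line boundaries. A rough sketch of how one might get complete lines out of such a callback follows. The helper make_line_buffer and the names handler and collected are hypothetical, not part of RCurl; the sketch assumes a plain-text (not gzipped) body with "\n" line endings, and it drops blank lines.

```r
# Hypothetical helper: wraps a line consumer in a chunk callback that
# holds back any trailing partial line until the rest of it arrives.
make_line_buffer <- function(on_lines) {
  pending <- ""  # text received after the last newline seen so far
  function(txt, ...) {
    pending <<- paste0(pending, txt)
    # Position of the last newline in the buffer, or -1 if there is none.
    last_nl <- max(gregexpr("\n", pending, fixed = TRUE)[[1]])
    if (last_nl > 0) {
      complete <- substr(pending, 1, last_nl - 1)
      pending <<- substr(pending, last_nl + 1, nchar(pending))
      on_lines(strsplit(complete, "\n", fixed = TRUE)[[1]])
    }
    TRUE
  }
}

# Simulate chunks that cut through line boundaries.
collected <- character(0)
handler <- make_line_buffer(function(lines) collected <<- c(collected, lines))
for (chunk in c('{"a":1}\n{"b"', ':2}\n{"c":', '3}\n')) handler(chunk)
print(collected)

# With RCurl the same callback would be passed as:
# curlPerform(url = "...", writefunction = handler)
```

The consumer passed to make_line_buffer() could just as well parse each batch of lines with fromJSON() instead of collecting them.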