R: reading a binary file that is zipped

I am trying to analyze some weather data on the Internet using R. The data is a binary file that was gzipped. Example file:

ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz

If I download the file to my computer and manually unzip it, I can easily do the following:

    myFile <- ( "/tmp/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101" )
    to.read = file( myFile, "rb")
    myPoints <- readBin(to.read, real(), n=1e6, size = 4, endian = "little")

What I would like to do is automate the downloading and decompressing along with the reading. So I thought it would be as simple as the following:

    p <- gzcon( url( "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz" ) )
    myPoints <- readBin(p, real(), n=1e6, size = 4, endian = "little")

This seems to work just dandy, but in the manual steps the vector myPoints has a length of 518400, which is correct. However, if R handles the download and the read, as in the second example, I get a vector of a different length every time I run the code. Seriously. I'm not smoking anything. I swear. I have run it multiple times, and each time the vector has a different length, always less than the expected 518400.

I also tried to get R to download the gzip file using the following:

    temp <- tempfile()
    myFile <- download.file("ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz",temp)

I found that this often returned a warning that the downloaded file was not the expected size, as below:

    Warning message:
    In download.file("ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz", :
      downloaded length 162176 != reported length 179058

Any tips you can give me to help me solve this problem?

-J

1 answer

Try the following:

    R> remfname <- "ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz"
    R> locfname <- "/tmp/data.gz"
    R> download.file(remfname, locfname)
    trying URL 'ftp://ftp.cpc.ncep.noaa.gov/precip/CPC_UNI_PRCP/GAUGE_GLB/V1.0/2005/PRCP_CU_GAUGE_V1.0GLB_0.50deg.lnx.20050101.gz'
    ftp data connection made, file length 179058 bytes
    opened URL
    ==================================================
    downloaded 174 Kb

    R> con <- gzcon(file(locfname, "rb"))
    R> myPoints <- readBin(con, real(), n=1e6, size = 4, endian = "little")
    R> close(con)
    R> str(myPoints)
     num [1:518400] 0 0 0 0 0 0 0 0 0 0 ...
    R>
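
If you want the download-then-read steps wrapped up for reuse, something along these lines might work. This is only a sketch under a few assumptions: `read_gz_floats()` and `fetch_and_read()` are hypothetical helper names (not part of base R), the `expected` length check and the chunked loop are defensive extras that the session above does not need for a local file, and `numeric()` stands in for `real()`, which newer R releases no longer provide as far as I know. Passing `mode = "wb"` to `download.file()` is there so that the binary file is not altered on platforms that would otherwise download in text mode.

```r
# Read every 4-byte little-endian float from a gzipped file on disk.
# read_gz_floats() is a hypothetical helper, not a base R function.
read_gz_floats <- function(path, expected = NULL, chunk = 1e6) {
  con <- gzcon(file(path, "rb"))
  on.exit(close(con))

  parts <- list()
  repeat {
    # readBin() may return fewer than `chunk` values; keep reading
    # until it returns nothing more.
    x <- readBin(con, what = numeric(), n = chunk, size = 4,
                 endian = "little")
    if (length(x) == 0L) break
    parts[[length(parts) + 1L]] <- x
  }

  values <- unlist(parts, use.names = FALSE)
  if (is.null(values)) values <- numeric(0)

  # Optional sanity check against a known record count.
  if (!is.null(expected) && length(values) != expected) {
    stop(sprintf("read %d values, expected %d",
                 length(values), as.integer(expected)))
  }
  values
}

# Download to a temporary file first, then read it locally.
# fetch_and_read() is likewise a hypothetical helper.
fetch_and_read <- function(url, expected = NULL) {
  tmp <- tempfile(fileext = ".gz")
  on.exit(unlink(tmp))
  status <- download.file(url, tmp, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed")
  read_gz_floats(tmp, expected)
}

# Usage (needs network access, so it is left commented out):
# myPoints <- fetch_and_read(remfname, expected = 518400)
```

If the download is cut short, as in the warning quoted in the question, the length check should fail loudly instead of silently returning a shorter vector.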