R: workaround for readBin's character limit (10,000 bytes)?

I have a file that consists of an XML character header followed by binary data, which I read in R using readBin:

zz <- file('myfile', 'rb')

# Read header
x <- readBin(zz, 'character')

# Read binary data
... 

However, when the header exceeds 10,000 bytes, I get the following:

Warning message:
 In readBin(zz, 'character') :
 null terminator not found: breaking string at 10000 bytes

I tried calling readBin in a loop until the string matches the end of the header and then concatenating the pieces, but the resulting XML does not validate, because some pieces have corrupted endings (for example, \xa0W\x97^\xff\177 appended at the end).

How should I deal with this character limit in readBin? Are there any simple workarounds?

Any suggestions are welcome. Thanks!

UPDATE

The following is a reproducible example:

url <- 'http://www.enetpulse.com/wp-content/uploads/sample_xml_feed_enetpulse_icehockey.xml'
x <- paste(readLines(url), collapse = '\n')  # more than 10 000 bytes

f <- tempfile()
zz <- file(f, 'wb')
writeBin(x, zz)  # header
writeBin(1:10000, zz)  # data
close(zz)

# readBin
zz <- file(f, 'rb')
y <- readBin(zz, 'character')
# Warning message:
# In readBin(zz, "character") :
#   null terminator not found: breaking string at 10000 bytes
y
# "... participantFK=\"98707\" [\x97^\xff\177"
close(zz)

# readChar
zz <- file(f, 'rb')
readChar(zz, nchars = 999999)
# Error in readChar(zz, nchars = 999999) : 
#   invalid UTF-8 input in readChar()
close(zz)

# readBin-loop
library(XML)
p <- xmlParse(x)  # parsing the original XML works
zz <- file(f, 'rb')
fun <- function(x) readBin(zz, 'character')
res <- paste(sapply(1:4, fun), collapse = '')
p2 <- xmlParse(res)  # errors!

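OK. One option is to read the file as raw bytes, block by block, until a block contains the null terminator, and then rewind the connection to the byte just after it: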

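block <- 256 * 4  # bytes read per iteration
zz <- file(f, 'rb')
rr <- raw()
found <- FALSE
while (!found) {
    r <- readBin(zz, "raw", block)
    if (length(r) == 0) stop("no null terminator found")  # reached end of file
    w <- head(which(r == as.raw(0)), 1)  # index of the first null byte, if any
    if (length(w)) {
        rr <- c(rr, r[seq_len(w - 1)])  # keep only the bytes before the terminator
        found <- TRUE
        seek(zz, -(length(r) - w), origin = "current")  # rewind to just after the null
    } else {
        rr <- c(rr, r)
    }
}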

library(XML)
p <- xmlParse(rawToChar(rr), asText=TRUE)
dd <- readBin(zz, "integer",10000)
close(zz)

The XML is now in p and the binary data in dd.
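If the file is small enough to fit in memory, a simpler variant of the same idea is to read everything as raw bytes in one call and split at the first null byte. The following is only a sketch under that assumption: the helper name read_header_and_data and its n_int argument are made up for illustration, and the demo writes its own small test file instead of downloading the XML feed.

```r
# Hypothetical helper (not part of the original answer): split a file made of
# a null-terminated text header followed by binary integer data.
read_header_and_data <- function(path, n_int) {
  # Read the whole file as raw bytes; assumes it fits in memory.
  bytes <- readBin(path, "raw", n = file.size(path))
  # Index of the first null byte, i.e. the terminator written by writeBin().
  nul <- which(bytes == as.raw(0))[1]
  if (is.na(nul)) stop("no null terminator found")
  header <- rawToChar(bytes[seq_len(nul - 1)])
  # readBin() also accepts a raw vector, so the remainder can be decoded directly.
  data <- readBin(bytes[-seq_len(nul)], "integer", n = n_int)
  list(header = header, data = data)
}

# Self-contained demo with a header longer than 10,000 bytes.
f <- tempfile()
zz <- file(f, "wb")
hdr <- paste0("<root>", paste(rep("<a>text</a>", 2000), collapse = ""), "</root>")
writeBin(hdr, zz)      # header, written with a trailing null byte
writeBin(1:10000, zz)  # data
close(zz)

out <- read_header_and_data(f, 10000)
nchar(out$header)
head(out$data)
```

The header string can then be passed to xmlParse(out$header, asText=TRUE) as before. The block-wise loop remains the better choice when the binary part is too large to load at once.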

