I don't think there is a good R tool for reading a random sample of lines from a file (maybe it could be added as an extension to read.table or to fread from the data.table package).
With Perl, this task is easy. For example, to read a random sample of roughly 1% of the lines in your file, you can do this:
xx <- system(paste("perl -ne 'print if (rand() < .01)'", shQuote(big_file)), intern = TRUE)
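Here I call it from R using system. Each line is kept independently with probability 0.01, so xx is a character vector holding approximately 1% of the lines of your file (the exact count varies from run to run).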
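Since the result is a character vector of raw lines, it still has to be parsed before use. A minimal sketch, assuming a comma-separated file: the sample lines and column names below are made up for illustration, and the names are supplied by hand because a random sample will usually not include the header line.

```r
# Hypothetical sampled lines, standing in for the output of system(..., intern = TRUE)
xx <- c("1,a,0.5", "7,b,1.2", "9,c,3.4")

# Parse the lines as if they were a small file; the column names are assumptions
df <- read.table(text = xx, sep = ",",
                 col.names = c("id", "label", "value"),
                 stringsAsFactors = FALSE)
str(df)
```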
You can wrap it all in a function:
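read_partial_rand <- function(big_file, percent) {
  # percent is a probability between 0 and 1 (e.g. 0.01 keeps about 1% of the lines)
  cmd <- paste0("perl -ne 'print if (rand() < ", percent, ")'")
  # quote the file name so that paths containing spaces still work
  cmd <- paste(cmd, shQuote(big_file))
  # intern = TRUE returns the sampled lines as a character vector
  system(cmd, intern = TRUE)
}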
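If Perl is not available, the same idea (keep each line independently with a given probability) can be sketched in base R by reading the file in chunks. This is only an illustration, not a tuned implementation: the helper name sample_lines and the chunk_size default are my own choices, and it will likely be slower than the Perl one-liner on very large files.

```r
# Hypothetical helper: keep each line with probability p, reading chunk by chunk
sample_lines <- function(path, p, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- character(0)
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break          # end of file reached
    kept <- c(kept, chunk[runif(length(chunk)) < p])
  }
  kept
}

# Small demonstration on a temporary file of 1000 lines
tmp <- tempfile()
writeLines(as.character(1:1000), tmp)
set.seed(1)
s <- sample_lines(tmp, 0.1)
length(s)  # roughly 100; the exact count depends on the random draws
unlink(tmp)
```

Only one chunk plus the kept lines is held in memory at a time, so the whole file never needs to be loaded at once.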