Increase memory size limit in R

I have an R program that combines 10 files, each of which is 296 MB. I increased the memory limit to 8 GB (the size of my RAM) with

--max-mem-size=8192M 

but when I ran the program I got this error:

 In type.convert(data[[i]], as.is = as.is[i], dec = dec, na.strings = character(0L)) : Reached total allocation of 7646Mb: see help(memory.size) 

Here is my R program:

    S1 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_1_400.txt")
    S2 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_401_800.txt")
    S3 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_801_1200.txt")
    S4 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_1201_1600.txt")
    S5 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_1601_2000.txt")
    S6 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_2001_2400.txt")
    S7 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_2401_2800.txt")
    S8 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_2801_3200.txt")
    S9 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_3201_3600.txt")
    S10 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_3601_4000.txt")
    options(max.print = 154.8E10)
    combine_result <- rbind(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10)
    write.table(combine_result, file = "C:/sim_omega3_1_4000.txt", sep = ";",
                row.names = FALSE, col.names = TRUE, quote = FALSE)

Can anyone help me with this?

Thanks,

Shruti.

4 answers

I suggest reading the "Memory usage" section of ?read.csv2 :

Memory usage:

  These functions can use a surprising amount of memory when reading large files. There is extensive discussion in the 'R Data Import/Export' manual, supplementing the notes here.

  Less memory will be used if 'colClasses' is specified as one of the six atomic vector classes. This can be particularly so when reading a column that takes many distinct numeric values, as storing each distinct value as a character string can take up to 14 times as much memory as storing it as an integer.

  Using 'nrows', even as a mild over-estimate, will help memory usage.

  Using 'comment.char = ""' will be appreciably faster than the 'read.table' default.

  'read.table' is not the right tool for reading large matrices, especially those with many columns: it is designed to read _data frames_ which may have columns of very different classes. Use 'scan' instead for matrices.
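
For one of your files that advice might look something like the sketch below. The column classes and row count here are hypothetical placeholders, since the question does not show the file layout; substitute the real values for your data:

    ## Sketch only: the classes and nrows value are assumptions, not taken from the question.
    S1 <- read.csv2("C:/Sim_Omega3_results/sim_omega3_1_400.txt",
                    colClasses = c("integer", "numeric", "numeric"),  # one entry per column
                    nrows = 1200000)                                  # mild over-estimate of the row count

With colClasses given, the numeric columns are never stored as character vectors while the file is being parsed, which is where most of the extra memory goes.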

Memory allocation requires contiguous blocks. The size a file takes on disk may not be a good indicator of how large the object will be once loaded into R. You can inspect one of these S data frames with the function:

 ?object.size 
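
For instance, assuming S1 has already been read in:

    object.size(S1)                       # size in bytes
    print(object.size(S1), units = "Mb")  # the same figure in megabytes
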

Here is the function I use to see which objects take up the most space in my R session:

    getsizes <- function() {
      z <- sapply(ls(envir = globalenv()), function(x) object.size(get(x)))
      (tmp <- as.matrix(rev(sort(z))[1:10]))
    }

If you remove(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10) and then call gc() after computing combine_result, you may free up enough memory. I have also found that running the script through Rscript seems to give access to more memory than the GUI does, if you are on Windows.
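
In code, that order of operations looks roughly like this (reusing the object names from the question):

    combine_result <- rbind(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10)
    remove(S1, S2, S3, S4, S5, S6, S7, S8, S9, S10)  # drop the individual chunks
    gc()                                             # force a garbage collection so the freed space can be reused
    write.table(combine_result, file = "C:/sim_omega3_1_4000.txt", sep = ";",
                row.names = FALSE, col.names = TRUE, quote = FALSE)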


If these files are all in the same standard format and you just want to concatenate them in R, why read and write CSV at all? Use readLines / writeLines :

    files_in <- file.path("C:/Sim_Omega3_results",
                          c("sim_omega3_1_400.txt",
                            "sim_omega3_401_800.txt",
                            "sim_omega3_801_1200.txt",
                            "sim_omega3_1201_1600.txt",
                            "sim_omega3_1601_2000.txt",
                            "sim_omega3_2001_2400.txt",
                            "sim_omega3_2401_2800.txt",
                            "sim_omega3_2801_3200.txt",
                            "sim_omega3_3201_3600.txt",
                            "sim_omega3_3601_4000.txt"))

    file.copy(files_in[1], out_file_name <- "C:/sim_omega3_1_4000.txt")
    file_out <- file(out_file_name, "at")
    for (file_in in files_in[-1]) {
      x <- readLines(file_in)
      writeLines(x[-1], file_out)
    }
    close(file_out)
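
Note that file.copy() seeds the output with the first file, header included, and x[-1] drops the first line of each subsequent file, on the assumption that every file starts with the same header row. Only one file's lines are held in memory at a time, so the full combined data set is never built as a data frame in R.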
