Avoiding memory limits in R

I'm trying to replace the values in a matrix, in particular "t" → 1 and "f" → 0, but I keep getting error messages:

Error: cannot allocate vector of size 2.0 Mb ... Reached total allocation of 16345Mb: see help(memory.size) 

I am using a Windows 7 computer with 16 GB of memory, running 64-bit R in RStudio.

What I'm doing now:

 a <- matrix(dataset, nrow = nrow(dataset), ncol = ncol(dataset), byrow = TRUE)
 memory.size()        # check current memory usage
 a[a == "t"] <- 1

where dataset is a data frame of size roughly 525000 × 300. memory.size() reports less than 4 GB and memory.limit() reports 16 GB. Why does this one line require so much memory to execute? Is there a way to do the replacement without hitting the memory limit (and are there good general tips for avoiding it), and if so, will it cost a lot of running time? I'm still fairly new to R, so I don't know whether this depends on the data class used and how R allocates memory ...

1 answer

When you call this line

 a[a=="t"] <- 1 

R needs to create a whole new logical matrix, the same size as a, to use for indexing into a. If a is huge, this logical matrix will also be huge.
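For a rough sense of scale (a minimal sketch on a small stand-in matrix, not your actual data): logicals take 4 bytes each in R, so the index alone for a 525000 × 300 matrix is about 630 MB, before counting any copies of a itself.

 # Sketch: measure the intermediate logical index on a small example
 m <- matrix(sample(c("t", "f"), 1e6, replace = TRUE), nrow = 1000)
 idx <- (m == "t")                       # a full logical matrix, same dimensions as m
 print(object.size(idx), units = "MB")   # ~3.8 MB: 4 bytes per logical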

Perhaps you can try working with smaller sections of the matrix, rather than trying to do it all in one shot.

 for (i in 1:ncol(a)) {
   ix <- (a[, i] == "t")   # logical index for one column at a time
   a[ix, i] <- 1
 }

It is not fast or elegant, but it can get around the memory issue.
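Another option worth trying (a sketch, assuming dataset really contains only "t"/"f" values): skip the character matrix entirely and convert the original data frame column by column, so at most one column's worth of temporaries exists at a time:

 # Sketch: convert "t"/"f" to 1/0 directly in the data frame,
 # one column at a time (assumes every column holds only "t" or "f")
 for (j in seq_along(dataset)) {
   dataset[[j]] <- as.integer(dataset[[j]] == "t")
 }

Here as.integer() on the logical comparison yields 1 for "t" and 0 for "f", which covers both replacements from the question in one pass.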
