R - memory consumption increases over time

My code is as follows (this is a slightly simplified version of the original, but it still reproduces the problem).

    require(VGAM)

    Median.sum = vector(mode="numeric", length=75)
    AA.sum = vector(mode="numeric", length=75)
    BB.sum = vector(mode="numeric", length=75)
    Median = array(0, dim=c(75, 3))
    AA = array(0, dim=c(75, 3))
    BB = array(0, dim=c(75, 3))

    y.sum = vector(mode="numeric", length=100000)
    y = array(0, dim=c(100000, 3))
    b.size = vector(mode="numeric", length=3)
    c.size = vector(mode="numeric", length=3)

    for (h in 1:40) {
      for (j in 1:75) {
        for (i in 1:100000) {
          y.sum[i] = 0
          for (f in 1:3) {
            b.size[f] = rbinom(1, 30, 0.9)
            c.size[f] = 30 - rbinom(1, 30, 0.9) + 1
            y[i, f] = sum( rlnorm(b.size[f], 8.5, 1.9) ) + sum( rgpd(c.size[f], 120000, 1870000, 0.158) )
            y.sum[i] = y.sum[i] + y[i, f]
          }
        }
        Median.sum[j] = median(y.sum)
        AA.sum[j] = mean(y.sum)
        BB.sum[j] = quantile(y.sum, probs=0.85)
        for (f in 1:3) {
          Median[j,f] = median(y[,f])
          AA[j,f] = mean(y[,f])
          BB[j,f] = quantile(y[,f], probs=0.85)
        }
      }
      #gc()
    }

It fails partway through execution (h = 7, j = 1, i = 93065) with the error:

 Error: cannot allocate vector of size 526.2 Mb 

Immediately after getting this message, I read several related questions, but that was still not enough. Neither the garbage collector (gc()) nor clearing all objects from the workspace helps. That is, I tried both calling the garbage collector inside my code and deleting all the variables and declaring them again inside the loop (see where #gc() is; the latter is not included in the code that I posted).

This seems strange to me, since the whole procedure uses the same objects at every step of the loop (and so should consume the same amount of memory at every step). Why does memory consumption increase over time?
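
One way to check whether usage really grows from step to step is to record what R itself reports after each outer iteration. The following is only a minimal sketch: `mem.used.mb` is a hypothetical helper (not part of the question's code), the loop sizes are reduced so it finishes quickly, and it relies on column 2 of the matrix returned by `gc()` holding the "used" figures in Mb.

```r
# Hypothetical helper: memory currently used by R's heap, in Mb, as reported by gc().
mem.used.mb <- function() {
  sum(gc()[, 2])  # column 2 holds the "used" figures converted to Mb
}

n.iter <- 5
usage <- numeric(n.iter)      # one reading per outer iteration
x <- numeric(1e5)             # pre-allocated once, reused in every iteration

for (j in seq_len(n.iter)) {
  for (i in seq_along(x)) {
    x[i] <- i                 # overwrite in place, as in the question's loop
  }
  usage[j] <- mem.used.mb()   # reading taken after each outer iteration
}

print(round(usage, 1))
```

Note that `gc()` forces a collection each time it is called, so the readings describe memory after collection and may hide growth that happens between collections. Watching the R process in the system monitor is a useful cross-check.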

To make matters worse, if I keep working in the same R session and even execute:

    rm(list=ls())
    gc()

I still get the same error message even if I try to declare something as small as:

 abc = array(0, dim=c(10,3)) 

Only closing R and starting a new session helps. Why? Maybe there is a way to rewrite my loop?

R: 2.15.1 (32-bit), OS: Windows XP (32-bit)

I am completely new here, so any feedback is appreciated! Thanks in advance.


Edit (from Arun): I find that this behavior is even easier to reproduce with a simple example. Start a new R session, paste this code, and watch memory grow in your system monitor.

    mm <- rep(0, 1e4) # initialise a vector
    for (i in 1:1e3) {
      for (j in 1:1e3) {
        for (k in 1:1e4) {
          mm[k] <- k # already pre-allocated
        }
      }
    }
2 answers

This seems to work (moving the inner loop into a function). I did not run it to completion because it is slow, but I did not notice the memory growth that your code shows.

    require(VGAM)

    Median.sum = vector(mode="numeric", length=75)
    AA.sum = vector(mode="numeric", length=75)
    BB.sum = vector(mode="numeric", length=75)
    Median = array(0, dim=c(75, 3))
    AA = array(0, dim=c(75, 3))
    BB = array(0, dim=c(75, 3))

    inner.fun <- function() {
      y.sum = vector(mode="numeric", length=100000)
      y = array(0, dim=c(100000, 3))
      b.size = vector(mode="numeric", length=3)
      c.size = vector(mode="numeric", length=3)
      for (i in 1:100000) {
        y.sum[i] = 0
        for (f in 1:3) {
          b.size[f] = rbinom(1, 30, 0.9)
          c.size[f] = 30 - rbinom(1, 30, 0.9) + 1
          y[i, f] = sum( rlnorm(b.size[f], 8.5, 1.9) ) + sum( rgpd(c.size[f], 120000, 1870000, 0.158) )
          y.sum[i] = y.sum[i] + y[i, f]
        }
      }
      list(y.sum, y)
    }

    for (h in 1:40) {
      cat("\nh =", h, "; j = ")
      for (j in 1:75) {
        cat(j, " ")
        result = inner.fun()
        y.sum = result[[1]]
        y = result[[2]]
        Median.sum[j] = median(y.sum)
        AA.sum[j] = mean(y.sum)
        BB.sum[j] = quantile(y.sum, probs=0.85)
        for (f in 1:3) {
          Median[j,f] = median(y[,f])
          AA[j,f] = mean(y[,f])
          BB[j,f] = quantile(y[,f], probs=0.85)
        }
      }
    }
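
A possible variant of the same idea, sketched under assumptions: if only the summary statistics are needed, the function can return those instead of the full vectors, so the large temporaries never leave the function's scope. Here `simulate.block` is a hypothetical name, `rlnorm` alone stands in for the question's mix of `rlnorm` and `rgpd`, and the sizes are tiny. Whether this avoids the growth on R 2.15.1 has not been verified.

```r
# Hypothetical helper: simulate inside a function and return only summaries.
simulate.block <- function(n) {
  draws <- rlnorm(n, meanlog = 8.5, sdlog = 1.9)  # local to this call only
  c(median = median(draws),
    mean   = mean(draws),
    q85    = unname(quantile(draws, probs = 0.85)))
}

set.seed(1)
n.blocks <- 5
stats <- matrix(0, nrow = n.blocks, ncol = 3,
                dimnames = list(NULL, c("median", "mean", "q85")))

for (j in seq_len(n.blocks)) {
  stats[j, ] <- simulate.block(1000)  # only three numbers come back per call
}

print(stats)
```

After each call, `draws` is no longer reachable from the calling environment, so the collector is free to reclaim it.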

Add a call to gc() in the for (i in 1:100000) loop.

Adding a call to gc() in the tight loop of Arun's code reduces its memory footprint.

This shows memory growth:

    mm <- rep(0, 1e4) # initialise a vector
    for (i in 1:1e3) {
      for (j in 1:1e3) {
        for (k in 1:1e4) {
          mm[k] <- k # already pre-allocated
        }
      }
    }

This does not:

    mm <- rep(0, 1e4) # initialise a vector
    for (i in 1:1e3) {
      for (j in 1:1e3) {
        for (k in 1:1e4) {
          mm[k] <- k # already pre-allocated
          gc()
        }
      }
    }

It is something related to automatic garbage collection. The collector does get called in the first case, as gcinfo(TRUE) shows, but memory still grows fast.
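
To see what the collector is doing, `gcinfo(TRUE)` can be switched on around the loop. The sketch below is a scaled-down illustration, not a measured fix: it uses far fewer iterations than Arun's example, and it calls `gc()` once per outer iteration instead of in the innermost loop. That placement is an assumption on my part (an explicit collection on every inner iteration is very slow), and I have not checked that it is enough on R 2.15.1.

```r
old <- gcinfo(TRUE)        # report each collection; returns the previous setting

mm <- rep(0, 1e4)          # initialise a vector
for (i in 1:20) {          # reduced from 1e3 so the sketch runs quickly
  for (k in 1:1e4) {
    mm[k] <- k             # already pre-allocated
  }
  gc()                     # explicit collection once per outer iteration (assumption)
}

gcinfo(old)                # restore the previous reporting setting
```

With reporting on, every automatic or explicit collection prints a short message, which makes it easy to compare how often collections happen with and without the explicit `gc()` call.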
