Memory Management / cannot allocate a vector of size n Mb

I'm having trouble trying to use large objects in R. For example:

    > memory.limit(4000)
    > a = matrix(NA, 1500000, 60)
    > a = matrix(NA, 2500000, 60)
    > a = matrix(NA, 3500000, 60)
    Error: cannot allocate vector of size 801.1 Mb
    > a = matrix(NA, 2500000, 60)
    Error: cannot allocate vector of size 572.2 Mb   # Can't go smaller anymore
    > rm(list=ls(all=TRUE))
    > a = matrix(NA, 3500000, 60)   # Now it works
    > b = matrix(NA, 3500000, 60)
    Error: cannot allocate vector of size 801.1 Mb   # But that is all there is room for
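For reference, the sizes quoted in the errors are just the size of the single block being requested: matrix(NA, ...) creates a logical matrix, and a logical takes 4 bytes per element, so the arithmetic works out as:

    # Size of the single allocation R is asking for (logical = 4 bytes per element)
    3500000 * 60 * 4 / 2^20   # ~801.1 Mb, the first error
    2500000 * 60 * 4 / 2^20   # ~572.2 Mb, the second error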

I understand that this is due to the difficulty of getting contiguous blocks of memory (from here):

Error messages beginning with "cannot allocate vector of size" indicate a failure to obtain memory, either because the size exceeded the address-space limit for the process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit build there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.

How can I get around this? My main difficulty is that I get to a certain point in my script and R cannot allocate 200-300 Mb for an object... I cannot really pre-allocate the block, because I need the memory for other processing. This happens even when I delete unneeded objects.




Edit: Windows XP SP3, 4 GB RAM, R 2.12.0:

    > sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_Caribbean.1252  LC_CTYPE=English_Caribbean.1252
    [3] LC_MONETARY=English_Caribbean.1252 LC_NUMERIC=C
    [5] LC_TIME=English_Caribbean.1252

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base
+79
memory-management vector matrix r
Mar 02 '11 at 18:13
8 answers

Consider whether you really need all this data explicitly, or can the matrix be sparse? There is good support in R (see the Matrix package, for example) for sparse matrices.
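A minimal sketch of what that might look like (the dimensions and fill pattern are just for illustration):

    library(Matrix)

    # A mostly-empty 3,500,000 x 60 matrix stored in sparse form: only the
    # non-zero entries are kept, so memory use depends on them, not on nrow*ncol.
    m <- Matrix(0, nrow = 3500000, ncol = 60, sparse = TRUE)
    m[1:10, 1] <- 1:10
    object.size(m)   # a few Kb instead of hundreds of Mb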

Keep all other processes and objects in R to a minimum when you need to create objects of this size. Use gc() to clear now-unused memory, or, better, only create the object you need in one session.
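For example, a pattern along these lines (the object names are placeholders):

    # Drop objects you are finished with, then trigger garbage collection so
    # the freed memory shows up in gc()'s summary.
    rm(intermediate_result)          # placeholder name
    gc()

    # Check how big an object is before making another full copy of it.
    print(object.size(a), units = "Mb")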

If the above does not help, get a 64-bit machine with as much RAM as you can afford, and install the 64-bit version of R.

If you cannot do this, there are many online services for remote computing.

If you cannot do this, memory-mapping tools such as the ff package (or bigmemory, as Sascha mentions) will help you build a new solution. In my limited experience ff is the more advanced package, but you should read the High Performance Computing topic on the CRAN Task Views.
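For instance, a disk-backed matrix with ff might look roughly like this (a sketch only, not tested against any particular ff version; the dimensions mirror the question):

    library(ff)

    # The data live in a file on disk; only the chunks you actually index are
    # pulled into RAM, so the 32-bit address-space limit matters much less.
    a <- ff(vmode = "double", dim = c(3500000, 60))
    a[1:5, 1:5] <- 1      # touch a small window
    dim(a)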

+43
Mar 02 '11

For Windows users, the following helped me understand some memory limitations:

  • Before opening R, open the Windows Resource Monitor (Ctrl-Alt-Delete / Start Task Manager / Performance tab / click on the bottom "Resource Monitor" / "Memory" tab).
  • You will see how much RAM is already in use before R is even opened, and by which applications. In my case, 1.6 GB of the 4 GB total was used, so I could only get 2.4 GB for R; and now it gets worse ...
  • open R and create a 1.5 GB data set, then reduce its size to 0.5 GB; the Resource Monitor shows my RAM at almost 95% used.
  • use gc() to collect garbage => it works, I can see memory usage drop back to 2 GB (a sketch of the relevant calls follows this list).
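A minimal sketch of the calls I mean, for the 32-bit Windows builds of R of that era (memory.size() and memory.limit() are Windows-only):

    memory.limit()            # current cap in Mb for this R session (Windows-only)
    memory.size()             # Mb currently in use by R
    memory.size(max = TRUE)   # the most Mb R has obtained from the OS so far
    gc()                      # force garbage collection and print a summary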


Additional tips that work on my machine:

  • train the models, save them as an .RData file, close R, re-open R, and load the trained models again (see the sketch after this list). The Resource Manager usually shows lower memory usage, which means that even gc() does not recover all the memory it could, and closing/re-opening R works best for starting with the maximum memory available.
  • The other trick is to load only the training set (do not load the test set, which can typically be half the size of the training set). The training phase can use memory to the maximum (100%), so anything available is useful. All this is to be taken with a grain of salt, as I am still experimenting with R's memory limits.
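A rough sketch of that save/close/re-open workflow (the training function, object name, and file name are placeholders):

    # Session 1: train, save to disk, quit.
    fit <- some_training_function(training_set)   # placeholder for your training step
    save(fit, file = "fit.RData")
    quit(save = "no")

    # Session 2, in a fresh R process: load only what you need.
    load("fit.RData")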
+26
Jul 15 '14 at 9:35

Here is a presentation on this topic that you may find interesting:

http://www.bytemining.com/2010/08/taking-r-to-the-limit-part-ii-large-datasets-in-r/

I have not tried the things discussed myself, but the bigmemory package seems very useful.
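A minimal, untested sketch of what a file-backed matrix with bigmemory could look like (dimensions copied from the question; the file names are made up):

    library(bigmemory)

    # The matrix is backed by a file on disk, so it is not confined to the
    # 32-bit address space of the R process.
    b <- filebacked.big.matrix(nrow = 3500000, ncol = 60, type = "double",
                               backingfile = "b.bin", descriptorfile = "b.desc")
    b[1:3, 1:3] <- 0
    dim(b)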

+13
Mar 02 '11 at 18:32

The easiest way to get around this limitation is to switch to 64-bit R.

+10
Mar 03 '11 at 20:14

I had a similar problem, and I used 2 flash drives as "ReadyBoost". The two drives gave an additional 8 GB of memory (for the cache), which solved the problem and also increased the speed of the system as a whole. To use ReadyBoost, right-click on the drive, go to Properties, select the "ReadyBoost" tab, select the "Use this device" radio button, and click "Apply" or "OK" to configure it.

+6
Dec 10 '15 at 20:31

If you are running your script in a Linux environment, you can use this command (bsub is part of the LSF batch scheduler):

 bsub -q server_name -R "rusage[mem=requested_memory]" "Rscript script_name.R" 

and the server will allocate the requested memory for you (subject to the server's limits, but with a good server you can work with huge files).

+4
Sep 10 '15 at 8:18

I recently ran into a problem running caret on a dataset of 500 rows.

It said it could not allocate a vector of 137 Mb. I did the following:

  • Close other processes on your system, especially the browser
  • Save the needed R data frames to a csv file
  • Restart the R session and load the data frames

Voila, the command worked. It seems that rm() does not free memory in R, and even gc() does not help, as mentioned in one of the other answers.
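A rough sketch of the save-to-csv / restart / reload step (object and file names are placeholders):

    # Before restarting: write the frames you still need to disk.
    write.csv(my_frame, "my_frame.csv", row.names = FALSE)

    # After restarting R, in the fresh session:
    my_frame <- read.csv("my_frame.csv")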

-2
Feb 28 '16 at 16:21

Try increasing the usable memory of your system with the /3GB switch: http://msdn.microsoft.com/en-us/windows/hardware/gg487508

-7
Mar 02 '11 at 18:38