MATLAB preallocation: guess a large array or a small one?

According to this question, I should try to use preallocation in MATLAB.

Now I have a situation where I cannot calculate the exact size of the matrix for preallocation. I can only guess the size.

Assume that the actual size of the matrix is 100, but I do not know that.

Which scenario is more efficient:

  • Should I be generous? I guess a large size for the matrix, and at the end I remove the extra rows.
  • Should I be stingy? I guess a small size, and if it turns out to be wrong, I add new rows.

Thanks.

3 answers

In my opinion, the answer is a bit more complicated than @natan's answer suggests. I think two factors are missing from his answer:

  • Possible copies: if you underestimate the size of the matrix and reallocate it, all its old values have to be copied to the newly allocated location.

  • Contiguity of memory chunks: sometimes MATLAB can allocate the new memory contiguously at the end of the old matrix. In principle, in that scenario the old values do not need to be copied, since the new location is the same as the old one. However, if you add a row to a 2D matrix, the contents have to be copied even in this scenario, because MATLAB stores matrices in column-major order in memory.
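
The memory-layout point can be checked on a tiny matrix. This is only an illustrative sketch (the matrix values are made up): `A(:)` lists the elements in storage order, so it shows that appending a column leaves the old elements where they were, while appending a row interleaves new elements with the old ones.

```matlab
A = [1 2 3; 4 5 6];       % 2-by-3 matrix
linearA = A(:).';         % storage order: [1 4 2 5 3 6]

B = A;
B(:, end+1) = [7; 8];     % append a column
linearB = B(:).';         % [1 4 2 5 3 6 7 8], old elements keep their positions

C = A;
C(end+1, :) = [7 8 9];    % append a row
linearC = C(:).';         % [1 4 7 2 5 8 3 6 9], old elements are shifted apart
```

Whether MATLAB can actually avoid the copy when a column is appended depends on the memory manager; the sketch only shows why it is impossible in principle when a row is appended.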

So, my answer is:

First of all, what exactly do you not know about the size of the matrix? If you know one dimension, make it the number of rows, so that you only ever need to change the number of columns. That way, if the data already stored has to be copied, it is copied in large contiguous chunks.

Secondly, it depends on how much free memory you have at your disposal. If you are not short of RAM, there is nothing wrong with overestimating.

However, if you are short of memory, consider underestimating. BUT when you reallocate, increase the size of the new chunk at each iteration:

    BASIC_SIZE = X; % first estimate
    NEW_SIZE = Y;   % if need more, add this amount
    factor = 2;
    arr = zeros( m, BASIC_SIZE ); % first allocation, assuming we know number of rows
    while someCondition
        % process arr ...
        if needMoreCols
            arr(:, size(arr,2) + (1:NEW_SIZE) ) = 0; % allocate another block
            NEW_SIZE = round(NEW_SIZE * factor); % it seems like we are off in estimation, try larger chunk next time; factor should be > 1
        end
    end
    arr = arr(:, 1:actualNumOfCols ); % resize to actual size, discard unnecessary columns
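
For readers who want to run the idea, here is a minimal self-contained sketch of the same growing-chunk scheme. All concrete numbers (`m`, `basicSize`, `newSize`, `trueCols`) are assumptions chosen for illustration, and `trueCols` merely stands in for the size that is unknown in a real program.

```matlab
m = 3;             % known number of rows (assumed value)
basicSize = 4;     % first estimate of the number of columns (assumed)
newSize = 2;       % size of the first extra block (assumed)
factor = 2;        % growth factor, should be > 1
trueCols = 25;     % stand-in for the unknown final number of columns

arr = zeros(m, basicSize);   % first allocation
numCols = 0;                 % columns actually filled so far
numReallocs = 0;             % how many times the array had to grow
while numCols < trueCols
    if numCols == size(arr, 2)                   % out of room
        arr(:, size(arr, 2) + (1:newSize)) = 0;  % allocate another block
        newSize = round(newSize * factor);       % use a larger block next time
        numReallocs = numReallocs + 1;
    end
    numCols = numCols + 1;
    arr(:, numCols) = numCols;   % store one column of dummy data
end
arr = arr(:, 1:numCols);         % discard the unused columns
```

With these numbers the capacity grows 4, 6, 10, 18, 34, so 25 columns are stored with only 4 reallocations instead of one per column.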

+1 for an interesting question.

EDITED answer: A quick test at first suggested that it is better to add rows later, but it now seems more efficient to overestimate and preallocate again once you have some information about the right size. I started with a 3000-by-3000 matrix and assumed a 10% error in the size estimate, see below:

    clear all
    clc
    guess_size=3000;
    m=zeros(guess_size);

    %1. oops overestimated, take out rows
    tic
    m(end-300:end,:)=[];
    toc

    %1b. oops overestimated, preallocate again
    tic
    m=zeros(guess_size-300,guess_size);
    toc

    %2. oops overestimated, take out cols
    m=zeros(guess_size);
    tic
    m(:,end-300:end)=[];
    toc

    %2b. oops overestimated, preallocate again
    m=zeros(guess_size);
    tic
    m=zeros(guess_size,guess_size-300);
    toc

    %3. oops underestimated, add rows
    m=zeros(guess_size);
    tic
    m=zeros(guess_size+300,guess_size);
    toc

    %4. oops underestimated, add cols
    m=zeros(guess_size);
    tic
    m=zeros(guess_size,guess_size+300);
    toc

    Elapsed time is 0.041893 seconds.
    Elapsed time is 0.026925 seconds.
    Elapsed time is 0.041818 seconds.
    Elapsed time is 0.023425 seconds.
    Elapsed time is 0.027523 seconds.
    Elapsed time is 0.029509 seconds.

Options 1b and 2b are slightly faster than the underestimation cases, so if you can, it is better to overestimate and then preallocate again. Never delete rows from the array. Furthermore, adding columns appears somewhat more efficient, but this is just a quick and dirty test. See @Shai's detailed answer for the inner workings of ...
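
A side note on the indexing in the timing script, as a small sketch with assumed sizes: a range such as `end-300:end` covers 301 indices, not 300. The snippet below uses a 10-by-10 matrix and an overshoot of 3 to show the difference, and also shows that keeping the leading rows gives the same result as deleting the trailing ones.

```matlab
guessSize = 10;   % assumed small size for illustration
over = 3;         % assumed amount of overestimation
m = magic(guessSize);

a = m;
a(end-over+1:end, :) = [];   % delete exactly the last `over` rows
b = m(1:end-over, :);        % keep the leading rows instead, same result
c = m;
c(end-over:end, :) = [];     % this range removes over+1 rows
```

Here `a` and `b` are both 7-by-10, while `c` is 6-by-10.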


In addition to the other answers, here is the short version. There are three cases:

  • The array is relatively small (up to thousands of bytes) → it does not really matter.
  • The array is large, but RAM is not a limiting factor on your system → overestimate.
  • The array is large and RAM is limited on your system → do what Shai suggested.