When not to vectorize MATLAB code?

I am working on some MATLAB code that processes large (but not huge) data sets: 10,000 vectors of 784 elements each (not sparse), computing information about them that is stored in a 10,000 × 10 sparse matrix. To get the code working, I did some of the trickier parts iteratively, looping over the 10k items to process them, plus a few loops over the 10 items in the sparse matrix for cleanup.

My process initially took 73 iterations (so on the order of 730k loop passes) and ran in about 120 seconds. Not bad, but this is MATLAB, so I decided to vectorize it to speed it up.

In the end, I have a fully vectorized solution that gets the same answer (so it is correct, or at least as correct as my original solution), but it takes 274 seconds to run, which is less than half as fast!

This is the first time I've come across MATLAB code that runs slower vectorized than iteratively. Are there any rules of thumb or best practices for identifying when this is likely or possible?

I would like to share the code for some feedback, but it is for a currently open school assignment, so I really can't right now. If this turns out to be one of those “Wow, that's strange, you probably did something wrong” cases, I will probably revisit this in a week or two to see whether my vectorization is somehow off.

+5
4 answers

Vectorization in MATLAB often means allocating much more memory (creating a much larger array to avoid a loop, for example with a replication trick). With the improved JIT compilation of loops in recent versions, it is possible that the memory allocation needed for your vectorized solution cancels out any advantage, but without seeing the code it is hard to say. MATLAB has an excellent line-by-line profiler that should help you see which specific parts of the vectorized version are taking the time.
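The memory trade-off described above can be sketched as follows. This is a minimal illustration, not the asker's code: the sizes `n` and `d` and the distance computation are assumptions chosen only to show a vectorized form that builds a full-size temporary next to a loop that does not. Which one is faster depends on the MATLAB version and the problem size, so the sketch only measures and does not assume an outcome.

```matlab
% Hypothetical sizes, chosen only for illustration.
n = 2000;
d = 50;
X = rand(n, d);
y = rand(1, d);

% Loop version: one row at a time, only small temporaries.
tic;
d_loop = zeros(n, 1);               % pre-allocate the result
for i = 1:n
    diff_i = X(i, :) - y;           % 1-by-d temporary
    d_loop(i) = sqrt(diff_i * diff_i');
end
t_loop = toc;

% Vectorized version: repmat first builds an n-by-d copy of y.
tic;
D = X - repmat(y, n, 1);            % n-by-d temporary
d_vec = sqrt(sum(D .^ 2, 2));
t_vec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', t_loop, t_vec);
fprintf('max difference: %g\n', max(abs(d_loop - d_vec)));
```

To see where the time goes line by line, the same script can be run between `profile on` and `profile viewer`.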

+9

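Have you tried timing both versions as a function of problem size (the number of elements per vector [784] or the number of vectors [10,000])?

[Figure: Execution time comparison between vectorized and unvectorized implementations of the Gram-Schmidt orthogonalization algorithm]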

script:

clgs.m

function [Q,R] = clgs(A)
% QR factorization by unvectorized classical Gram-Schmidt orthogonalization

[m,n] = size(A);

R = zeros(n,n);     % pre-allocate upper-triangular matrix

% iterate over columns
for j = 1:n
    v = A(:,j);

    % project out the previous (already orthonormalized) columns
    for i = 1:j-1
        R(i,j) = A(:,i)' * A(:,j);
        v = v - R(i,j) * A(:,i);
    end

    R(j,j) = norm(v);
    A(:,j) = v / norm(v);   % normalize
end
Q = A;

clgs2.m

function [Q,R] = clgs2(A)
% QR factorization by classical Gram-Schmidt orthogonalization with a
% vectorized inner loop

[m,n] = size(A);
R = zeros(n,n);     % pre-allocate upper-triangular matrix

for k=1:n
    R(1:k-1,k) = A(:,1:k-1)' * A(:,k);
    A(:,k) = A(:,k) - A(:,1:k-1) * R(1:k-1,k);
    R(k,k) = norm(A(:,k));
    A(:,k) = A(:,k) / R(k,k);
end

Q = A;

benchgs.m

n = [300,350,400,450,500];

clgs_time=zeros(length(n),1);
clgs2_time=clgs_time;

for i = 1:length(n)
    A = rand(n(i));
    tic;
    [Q,R] = clgs(A);
    clgs_time(i) = toc;

    tic;
    [Q,R] = clgs2(A);
    clgs2_time(i) = toc;
end

semilogy(n,clgs_time,'b',n,clgs2_time,'r')
xlabel 'n', ylabel 'Time [seconds]'
legend('unvectorized CGS','vectorized CGS')
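A single `tic`/`toc` pair can be noisy, so one possible refinement of the benchmark is to repeat each measurement and keep the fastest run. The sketch below is an assumption-laden illustration: `f` is a stand-in workload rather than `clgs` or `clgs2`, and the repeat count `k` is arbitrary.

```matlab
% Best-of-k timing for a single call.
A = rand(300);
f = @() A' * A;     % stand-in workload; swap in @() clgs(A) or @() clgs2(A)
k = 5;              % hypothetical number of repeats
t = zeros(k, 1);    % pre-allocate the timings

for r = 1:k
    tic;
    f();            % result is discarded; only the elapsed time matters
    t(r) = toc;
end

fprintf('best of %d runs: %.4f s\n', k, min(t));
```

Taking the minimum rather than the mean reduces the influence of one-off delays such as the first call to a function.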
+7

How a vectorized operation traverses memory also matters: the same reduction can take noticeably different times depending on the dimension it runs along. For example:

>> A = randn(10000,10000);
>> tic; for n = 1 : 100; sum(A,1); end; toc
Elapsed time is 12.354861 seconds.
>> tic; for n = 1 : 100; sum(A,2); end; toc
Elapsed time is 22.298909 seconds.
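These timings are consistent with MATLAB storing arrays in column-major order, so that `A(:, j)` is contiguous in memory while `A(i, :)` is strided. A minimal sketch of the same effect with explicit slicing follows; the matrix size is an assumption, and the measured gap will vary by machine and version.

```matlab
A = randn(2000, 2000);   % hypothetical size, smaller than in the example above

% Column slices: each A(:, j) is a contiguous block of memory.
s_col = 0;
tic;
for j = 1:size(A, 2)
    s_col = s_col + sum(A(:, j));
end
t_col = toc;

% Row slices: each A(i, :) gathers elements a full column apart.
s_row = 0;
tic;
for i = 1:size(A, 1)
    s_row = s_row + sum(A(i, :));
end
t_row = toc;

fprintf('columns: %.4f s, rows: %.4f s\n', t_col, t_row);
```

Both loops compute the same total up to rounding; only the access pattern differs.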
0
