Vectorizing code to create a sparse matrix with a single 1 per row from an index vector

I have a large column vector y containing integer values from 1 to 10. I want to convert it to a matrix in which each row is all zeros except for a 1 at the column index given by the value in the corresponding row of y.

This example should make it clearer:

y = [3; 4; 1; 10; 9; 9; 4; 2; ...]

% gets converted to:

Y = [
    0 0 1 0 0 0 0 0 0 0;
    0 0 0 1 0 0 0 0 0 0;
    1 0 0 0 0 0 0 0 0 0;
    0 0 0 0 0 0 0 0 0 1;
    0 0 0 0 0 0 0 0 1 0;
    0 0 0 0 0 0 0 0 1 0;
    0 0 0 1 0 0 0 0 0 0;
    0 1 0 0 0 0 0 0 0 0;
    ...
    ]

I wrote the following code for this (it works):

m = length(y);
Y = zeros(m, 10);
for i = 1:m
    Y(i, y(i)) = 1;
end

I know that there are ways to remove the for loop in this code (vectorization). This post contains several, including something like:

Y = full(sparse(1:length(y), y, ones(length(y),1)));
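To make the one-liner above concrete, here is a minimal sketch that checks it against the loop on the first eight values of the example. One assumption is added here: the size arguments m and 10 are passed to sparse explicitly, because without them the result only has max(y) columns, which would be fewer than 10 whenever the value 10 does not occur in y.

```matlab
y = [3; 4; 1; 10; 9; 9; 4; 2];   % first eight values from the example above
m = numel(y);

% Loop version, as in the question
Y_loop = zeros(m, 10);
for i = 1:m
    Y_loop(i, y(i)) = 1;
end

% sparse(i, j, v, nrows, ncols) places v at each position (i(k), j(k));
% the explicit size keeps all 10 columns regardless of the values in y
Y_sparse = full(sparse((1:m).', y, 1, m, 10));

assert(isequal(Y_loop, Y_sparse))
```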

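However, to use this I had to convert y to double, and the result is about 3 times slower than my "for" approach when y has 10,000,000 elements.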

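  • Is this kind of vectorization likely to perform better for a very large y? I have often read that vectorizing leads to better performance (not only in MATLAB), but this kind of solution seems to involve more computation.

  • Is there a way to actually improve on the for loop in this example? Maybe the problem is simply that working on doubles instead of ints is not a fair comparison, but I could not find a way to use sparse otherwise.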
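If y is stored as an integer class rather than double, one option that avoids sparse altogether is to cast once and build linear indices by hand. The sketch below is only an illustration: the int8 input and the variable name yd are hypothetical. The cast matters because integer-class arithmetic saturates (int8 tops out at 127) and cannot be mixed with non-scalar doubles.

```matlab
y = int8([3; 4; 1; 10]);     % hypothetical integer-typed input
m = numel(y);
yd = double(y);              % cast once, before any index arithmetic

Y = zeros(m, 10);
% MATLAB stores matrices column by column, so element (r, c) of an
% m-by-10 matrix has linear index (c-1)*m + r
Y((yd - 1) * m + (1:m).') = 1;
```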


Here is a test to compare the approaches:

function [t,v] = testIndicatorMatrix()
    y = randi([1 10], [1e6 1], 'double');
    funcs = {
        @() func1(y);
        @() func2(y);
        @() func3(y);
        @() func4(y);
    };

    t = cellfun(@timeit, funcs, 'Uniform',true);
    v = cellfun(@feval, funcs, 'Uniform',false);
    assert(isequal(v{:}))
end
function Y = func1(y)
    m = numel(y);
    Y = zeros(m, 10);
    for i = 1:m
        Y(i, y(i)) = 1;
    end
end

function Y = func2(y)
    m = numel(y);
    Y = full(sparse(1:m, y, 1, m, 10, m));
end

function Y = func3(y)
    m = numel(y);
    Y = zeros(m,10);
    Y(sub2ind([m,10], (1:m).', y)) = 1;
end

function Y = func4(y)
    m = numel(y);
    Y = zeros(m,10);
    Y((y-1).*m + (1:m).') = 1;
end

The timings I get (in seconds):

>> testIndicatorMatrix
ans =
    0.0388
    0.1712
    0.0490
    0.0430

So the plain for loop is the fastest of the four here, presumably thanks to the JIT compiler!
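func3 and func4 in the benchmark compute the same indices in two ways. A minimal sketch of why they agree, on a small illustrative input: MATLAB stores matrices column by column, so element (r, c) of an m-by-n matrix sits at linear index (c-1)*m + r, which is what sub2ind returns.

```matlab
y = [3; 4; 1; 10];          % small illustrative input (column vector)
m = numel(y);
rows = (1:m).';

idx_sub2ind = sub2ind([m, 10], rows, y);   % as in func3
idx_manual  = (y - 1) .* m + rows;         % as in func4

assert(isequal(idx_sub2ind, idx_manual))

Y = zeros(m, 10);
Y(idx_manual) = 1;          % one assignment sets the single 1 in every row
```

The manual formula skips the input checking that sub2ind performs, which may account for part of the small timing difference between the two.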


It seems you need the full matrix Y. In that case, you can try this approach:

m = numel(y);
Y1(m,10) = 0; % Faster way to pre-allocate zeros than calling `zeros` (Y1 must not already exist)
              % Source - http://undocumentedmatlab.com/blog/preallocation-performance
linear_idx = (y-1)*m + (1:m)'; % y is stated to be a column vector,
                               % so y can be used directly instead of y(:)
Y1(linear_idx) = 1; % Y1 is the desired output
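One caveat with the Y1(m,10) = 0 pre-allocation is worth a quick check: it only produces a fresh zero matrix when Y1 does not exist yet. A minimal sketch, with a small m chosen purely for illustration:

```matlab
m = 5;

clear Y1                 % make sure no earlier Y1 is lying around
Y1(m, 10) = 0;           % assigning the last element grows Y1 to m-by-10, zero-filled
assert(isequal(Y1, zeros(m, 10)))

% If the variable already holds data, the same statement does NOT reset it
Y2 = ones(m, 10);
Y2(m, 10) = 0;           % only the last element changes
assert(nnz(Y2) == m*10 - 1)
```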

Using the benchmark from Amro's post and increasing the data size a bit:

y = randi([1 10], [1.5e6 1], 'double');

and using the faster pre-allocation scheme Y(m,10)=0; in place of Y = zeros(m,10);, I got these results:

>> testIndicatorMatrix
ans =
    0.1798
    0.4651
    0.1693
    0.1457

So with these changes the vectorized approach (the last entry) comes out ahead, an improvement of roughly 15% over the for-loop (the first entry).


How about keeping the result sparse?

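tic;
N = 1e6;
y = randperm( N );                        % random permutation, so values run over 1..N here
Y = spalloc( N, N, N );                   % pre-allocated sparse matrix (overwritten below)
inds = sub2ind( size(Y), y(:), (1:N)' );  % linear indices (not used below)
Y = sparse( 1:N, y, 1, N, N, N );         % N-by-N sparse matrix with a single 1 per row
toc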

Elapsed time is 0.144683 seconds.
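For the m-by-10 case from the question, the same idea would look roughly like the sketch below. It assumes the downstream code can work with a sparse matrix, so the full(...) conversion is skipped; the input size of 1e5 is arbitrary.

```matlab
y = randi([1 10], [1e5 1]);          % arbitrary test input, values 1..10
m = numel(y);

S = sparse((1:m).', y, 1, m, 10);    % stays sparse: only m nonzeros are stored

assert(issparse(S))
assert(nnz(S) == m)                  % one stored entry per row
assert(all(sum(S, 2) == 1))          % each row contains exactly one 1
```

Whether this pays off depends on what is done with the matrix afterwards; converting back with full(S) reintroduces the cost measured in the benchmarks above.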
