Is there a better vectorization technique than this?

I am trying to find out whether there is a more efficient way to write this sample code. Here y is a 1xM vector (say, 1x1000) and z is an NxM matrix (for example, 5x1000).

mean(ones(N,1)*y.^3 .* z,2) 

This code works fine, but I'm worried that if N gets large, ones(N,1)*y.^3 will become wasteful and slow things down.

Thoughts?

1 answer

This is not so bad for a small matrix. You can often benefit from using bsxfun in problems like this, but here the matrices are simply too small for it to gain you anything.

 >> N = 5; M = 1000;
 >> y = rand(1,M);
 >> z = rand(N,M);
 >> mean(ones(N,1)*y.^3 .* z,2)
 ans =
     0.12412
     0.11669
     0.12102
     0.11976
     0.12196
 >> mean(bsxfun(@times,y.^3,z),2)
 ans =
     0.12412
     0.11669
     0.12102
     0.11976
     0.12196
 >> z*y.'.^3/M
 ans =
     0.12412
     0.11669
     0.12102
     0.11976
     0.12196

As you can see, all three solutions return the same result, so all are equally valid.

Now compare the time each one requires.

 >> timeit(@() mean(ones(N,1)*y.^3 .* z,2))
 ans =
     0.00023018
 >> timeit(@() mean(bsxfun(@times,y.^3,z),2))
 ans =
     0.00026829
 >> timeit(@() z*y.'.^3/M)
 ans =
     0.00016594

As I said, you gain very little here. In fact, bsxfun gains nothing at all, and is even slightly slower. But you can gain a little by rewriting the expression in the third form I showed. A little, but only a little.

Edit: if N is large, the timings change considerably.

 >> N = 2000; M = 1000;
 >> y = rand(1,M);
 >> z = rand(N,M);
 >> timeit(@() mean(ones(N,1)*y.^3 .* z,2))
 ans =
     0.034664
 >> timeit(@() mean(bsxfun(@times,y.^3,z),2))
 ans =
     0.012234
 >> timeit(@() z*y.'.^3/M)
 ans =
     0.0017674

The difference is that the first solution explicitly creates the NxM expanded copy of y.^3, which is inefficient.

The bsxfun solution is better, because it never explicitly forms that expanded copy of y.^3. But it still forms the matrix of products, which is N by M, so it still has to allocate and fill a large chunk of memory.
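
If you are on MATLAB R2016b or later, implicit expansion gives you the same behavior as bsxfun without writing bsxfun at all. A minimal sketch, assuming such a release and the same y and z as above:

 % Minimal sketch, assuming MATLAB R2016b or later (implicit expansion).
 % y.^3 is 1xM and z is NxM, so .* expands y.^3 across the rows of z
 % implicitly, just as bsxfun does. No ones(N,1) copy is made, but the
 % NxM product matrix is still formed before mean collapses it.
 out_implicit = mean(y.^3 .* z, 2);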

You should understand why the matrix-vector multiply is best in all cases: no large intermediate matrix is ever formed. Since * is just a dot product along each row (that is, a sum of products), it should be more efficient. I then divide by M after the fact to get the desired mean.
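
For intuition, here is a minimal loop sketch (illustration only, not something you would run for speed, using the same y, z, N, and M as above) of exactly what the matrix-vector form computes for each row:

 % Illustration only: what z*(y.^3).'/M computes, one row at a time.
 out_loop = zeros(N,1);
 for i = 1:N
     % dot product of row i of z with y.^3, divided by M to give the mean
     out_loop(i) = sum(z(i,:) .* y.^3) / M;
 end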

A slight improvement over the last ...

 >> timeit(@() z*(y.*y.*y).'/M)
 ans =
     0.0015793

which gains a bit over the power operator.
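
If you want to see where that small gain comes from, you can time the cube on its own; the numbers will vary with your machine and release, so I won't quote any here:

 >> timeit(@() y.^3)
 >> timeit(@() y.*y.*y)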

And timeit? It comes from the File Exchange, a terribly useful utility written by Steve Eddins for timing fragments of code.
