Transpose in BLAS or do it yourself?

I am writing some scientific code in Fortran 77 and I am having a debate about which approach will be faster.

Basically, I have an M x N matrix, let's call it A, where M is greater than N. Later in the code I need to multiply the transpose of A by a bunch of vectors.

My question is: would it be faster to transpose A myself and store the result, or to just pass the transpose flag when I call BLAS?

Thanks! -Patrick

+4
1 answer

My gut feeling tells me to use the transpose flag. In that case, you are computing many dot products in one step.

In truth, it is very hard to say without actually running the code. Modern BLAS implementations use cache-blocking techniques, which make this kind of analysis difficult at best.
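
To make the "many dot products" remark concrete, here is a minimal sketch in C that mimics Fortran's column-major storage with plain loops. It is only an illustration of the two memory access patterns, not how any BLAS library is implemented, and the function names (`atx_in_place`, `transpose`, `bx`) are made up for this example. With the transpose flag (for instance the `TRANS` argument of `DGEMV`), each output element is the dot product of one column of A with the vector, and a column is contiguous in column-major storage. With the do-it-yourself route, you pay once for the explicit transpose and then run an ordinary matrix-vector product on the stored copy.

```c
#include <stddef.h>

/* Column-major (Fortran-style) layout: element (i, j) of a matrix with
   `rows` rows is stored at index i + j*rows. All names are hypothetical. */

/* Option 1: y = A^T x computed directly from the M-by-N matrix A.
   Each y[j] is the dot product of column j of A with x, and that
   column is contiguous in memory. */
void atx_in_place(size_t m, size_t n, const double *a,
                  const double *x, double *y)
{
    for (size_t j = 0; j < n; ++j) {
        double sum = 0.0;
        for (size_t i = 0; i < m; ++i)
            sum += a[i + j * m] * x[i];
        y[j] = sum;
    }
}

/* Option 2, step 1: store B = A^T explicitly (B is N-by-M, column-major).
   This costs one extra pass over A plus M*N doubles of storage. */
void transpose(size_t m, size_t n, const double *a, double *b)
{
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < m; ++i)
            b[j + i * n] = a[i + j * m];
}

/* Option 2, step 2: y = B x for the N-by-M matrix B, walking B one
   column at a time so that its memory is also read contiguously. */
void bx(size_t n, size_t m, const double *b,
        const double *x, double *y)
{
    for (size_t j = 0; j < n; ++j)
        y[j] = 0.0;
    for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < n; ++j)
            y[j] += b[j + i * n] * x[i];
}
```

Both routes read every element of A exactly once per vector and produce the same result, so any speed difference comes down to memory traffic and to how the BLAS build at hand handles the transposed case. Timing both variants with realistic M, N, and number of vectors is the only reliable way to decide.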

+6