Fast matrix multiplication calculation algorithm

In the middle of C ++ code, eclipse, I need to calculate the multiplication of the matrices A and B with a size of 2400 * 3600 (so the sizes do not match). Matrices are stored in two-dimensional floating-point arrays. They are not sparse, without restriction.

Each multiplication takes so much time (several minutes), and I seriously need to reduce it, because I have a cycle that repeats 50 million times. and every time you need to multiply the new A and B. Any recommendations are recommended to reduce the time complexity. (even if you change the structure of data storage, if you think this can help). For example, what if I store data in one-dimensional arrays? Or using vectors instead of arrays?

In one specific case, the first column is always 1, and the values โ€‹โ€‹are 1, -1, or zero. Any idea for this case?
In other cases, the values โ€‹โ€‹can be any. ** One of these multiplications of X is multiplied by its transposition. Are there any recommendations on this particular one?

+7
source share
4 answers

I wouldnโ€™t deceive myself, trying to write my own: Google for LAPACK or BLAS, two time-tested packages for numerical calculations, both optimized for the Nth degree. Both have C APIs that you can use.

+13
source

This will certainly help to keep the second transposed matrix so that the columns correspond to the cache lines instead of lines. The difference in access time between the L2 cache and main memory is 10 or so.

+8
source

You can try Eigen to try.

+2
source

If you are talking about millions of multiplications, the first thing I would like to do is turn to something like CUDA or DirectCompute to turn off the GPU, which is much better for this kind of thing. This is what MATLAB did, even if GPU acceleration is optional.

There are many examples of multiplying matrix multipliers by GPUs, so your work should not be too complicated.

+1
source

All Articles