In your other post, you said the following code would take 0.9 seconds.
MatrixXd A = MatrixXd::Random(1000, 1000);
MatrixXd B = MatrixXd::Random(1000, 500);
MatrixXd X;
I tried a little test on my machine, an Intel Core i7 running Linux. My complete test code looks like this:
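(The listing below is a minimal sketch of such a test program; the file name matcal.cpp is taken from the compile commands further down.)

#include <Eigen/Dense>
using namespace Eigen;

int main()
{
    // Same matrices as in your post: a 1000x1000 times 1000x500 product
    MatrixXd A = MatrixXd::Random(1000, 1000);
    MatrixXd B = MatrixXd::Random(1000, 500);
    MatrixXd X;
    X = A * B;   // the matrix product being timed
    return 0;
}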
I just use the time command from Linux, so the measured time includes starting and stopping the executable.
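(If you wanted to exclude that startup overhead, one option, just a sketch and not what I did here, is to time only the product inside the program with std::chrono:)

#include <chrono>
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;

int main()
{
    MatrixXd A = MatrixXd::Random(1000, 1000);
    MatrixXd B = MatrixXd::Random(1000, 500);
    auto t0 = std::chrono::steady_clock::now();
    MatrixXd X = A * B;   // time only the product, not program startup
    auto t1 = std::chrono::steady_clock::now();
    std::cout << std::chrono::duration<double>(t1 - t0).count() << " s\n";
    return 0;
}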
1 / Compilation without optimization (gcc compiler):
g++ -I/usr/include/eigen3 matcal.cpp -O0 -o matcal
time ./matcal

real    0m13.177s   <- this is the time you should be looking at
user    0m13.133s
sys     0m0.022s
13 seconds, that is very slow. By the way, without the matrix multiplication (but still allocating the large matrices) it takes only 0.048 s, so almost all of the time goes into the product itself, the one that takes 0.9 s in your example. Why the difference?
Using compiler optimizations with Eigen is very important.

2 / Compilation with some optimization:
g++ -I/usr/include/eigen3 matcal.cpp -O2 -o matcal
time ./matcal

real    0m0.324s
user    0m0.298s
sys     0m0.024s
Now 0.324s, it's better!
3 / Compilation with all the optimization flags turned on (at least all the ones I know of, I'm not an expert in this area):
g++ -I/usr/include/eigen3 matcal.cpp -O3 -march=corei7 -mtune=corei7 -o matcal
time ./matcal

real    0m0.317s
user    0m0.291s
sys     0m0.024s
0.317 s, close to the previous result, but we do gain a few ms (consistently over several runs). So in my opinion you have a problem using Eigen: either you did not enable compiler optimizations, or your compiler does not do it on its own.
I am not an Eigen expert; I have only used it a few times, but the documentation is not bad and you should probably read it to make the most of the library.
Regarding performance comparisons with MATLAB: the last time I read about it, Eigen was not multithreaded, while MATLAB probably uses multithreaded libraries. For the matrix product you can split your matrix into several blocks and parallelize the multiplication of each block, for example with TBB, as in the sketch below.
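(Rough sketch of that idea only: A is split into horizontal strips and each strip's product with B is computed in a TBB parallel_for; the function name and the strip height of 100 rows are just illustrative choices.)

#include <Eigen/Dense>
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
using namespace Eigen;

// Compute X = A * B by splitting A into horizontal strips,
// one strip of rows per TBB task.
MatrixXd parallelProduct(const MatrixXd& A, const MatrixXd& B)
{
    MatrixXd X(A.rows(), B.cols());
    const int strip = 100;   // illustrative grain size (rows per task)
    tbb::parallel_for(tbb::blocked_range<int>(0, (int)A.rows(), strip),
        [&](const tbb::blocked_range<int>& r) {
            const int n = r.end() - r.begin();
            // Each task writes its own rows of X, so no locking is needed.
            X.middleRows(r.begin(), n).noalias() =
                A.middleRows(r.begin(), n) * B;
        });
    return X;
}

Whether this actually beats a single optimized Eigen product depends on your machine and matrix sizes, so measure it the same way as above.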