Vector-matrix multiplication is very slow in the OpenCV C++ interface

Using the "random stop" method, I determined that the first two of the following lines are very slow:

    cv::Mat pixelSubMue = pixel - vecMatMue[kk_real];            // ca. 35.5 %
    cv::Mat pixelTemp = pixelSubMue * covInvRef;                 // ca. 58.1 %
    cv::multiply(pixelSubMue, pixelTemp, pixelTemp);             // ca. 0 %
    cv::Scalar sumScalar = cv::sum(pixelTemp);                   // ca. 3.2 %
    double cost = sumScalar.val[0] * 0.5 + vecLogTerm[kk_real];  // ca. 3.2 %

  • vecMatMue is a std::vector<cv::Mat>. I know that this makes many copies, but using pointers here does not noticeably change performance.
  • pixelSubMue is a row vector, cv::Mat(1, 3, CV_64FC1)
  • covInvRef is a reference to a matrix, cv::Mat(3, 3, CV_64FC1)
  • vecLogTerm is a std::vector<double>

The code snippet above is in the inner loop, which is called millions of times.
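
Read as a formula, the five lines appear to evaluate a quadratic form plus a per-component constant. The symbols below are introduced here only for illustration: $x$ stands for `pixel`, $\mu_k$ for `vecMatMue[kk_real]`, $\Sigma_k^{-1}$ for `covInvRef`, and $c_k$ for `vecLogTerm[kk_real]`, with $d$ a 1×3 row vector.

```latex
\[
d = x - \mu_k, \qquad
\mathrm{cost} = \tfrac{1}{2}\, d \,\Sigma_k^{-1}\, d^{\mathsf T} + c_k
\]
```

The element-wise `cv::multiply` followed by `cv::sum` is the dot product of $d\,\Sigma_k^{-1}$ with $d$, so the whole snippet works on just three numbers per pixel and nine per matrix.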

Question: Is there a way to speed up this operation?

Edit: Thanks for the comments! I have now measured the time inside the program, and the percentages show how much of the time is spent on each line. The measurements were done in release mode. I took six measurements, each over millions of executions of the code.

I should probably also mention that the std::vector objects do not affect performance; I checked this by replacing them with constant objects.

Edit 2: I also implemented the algorithm using the C API. The corresponding lines are as follows:

    cvSub(pixel, vecPMatMue[kk], pixelSubMue);                        // ca. 24.4 %
    cvMatMulAdd(pixelSubMue, vecPMatFCovInv[kk], 0, pixelTemp);       // ca. 39.0 %
    cvMul(pixelSubMue, pixelTemp, pixelSubMue);                       // ca. 22.0 %
    CvScalar sumScalar = cvSum(pixelSubMue);                          // ca. 14.6 %
    cost = sumScalar.val[0] * 0.5 + vecFLogTerm[kk];                  // ca. 0.0 %

For the same input, the C++ implementation takes about 3100 ms, whereas the C API implementation takes only about 2050 ms (both figures are the total time for executing the snippet millions of times). I still prefer my C++ implementation, though, because I find it easier to read (other ugly changes were needed to make it work with the C API).

Edit 3: I rewrote the code without using any function calls for the actual calculations:

    capacity_t mue0 = meanRef.at<double>(0, 0);
    capacity_t mue1 = meanRef.at<double>(0, 1);
    capacity_t mue2 = meanRef.at<double>(0, 2);

    capacity_t sigma00 = covInvRef.at<double>(0, 0);
    capacity_t sigma01 = covInvRef.at<double>(0, 1);
    capacity_t sigma02 = covInvRef.at<double>(0, 2);
    capacity_t sigma11 = covInvRef.at<double>(1, 1);
    capacity_t sigma12 = covInvRef.at<double>(1, 2);
    capacity_t sigma22 = covInvRef.at<double>(2, 2);

    mue0 = p0 - mue0;
    mue1 = p1 - mue1;
    mue2 = p2 - mue2;

    capacity_t pt0 = mue0 * sigma00 + mue1 * sigma01 + mue2 * sigma02;
    capacity_t pt1 = mue0 * sigma01 + mue1 * sigma11 + mue2 * sigma12;
    capacity_t pt2 = mue0 * sigma02 + mue1 * sigma12 + mue2 * sigma22;

    mue0 *= pt0;
    mue1 *= pt1;
    mue2 *= pt2;

    capacity_t cost = (mue0 + mue1 + mue2) / 2.0 + vecLogTerm[kk_real];

Now calculations for each pixel need only 150 ms!

1 answer

It looks like you are compiling in debug mode, which probably explains the performance hit. You can profile your code with timing functions such as clock().

For example:

    #include <ctime>
    #include <iostream>

    clock_t start, end;
    ...
    start = clock();
    cv::Mat pixelTemp = pixelSubMue * covInvRef; // Very SLOW!
    end = clock();
    // CLOCKS_PER_SEC is the standard constant; CLK_TCK is an obsolete alias.
    std::cout << "Elapsed time in seconds: "
              << static_cast<double>(end - start) / CLOCKS_PER_SEC << std::endl;
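
A single matrix product may finish in less time than clock() can resolve, so timing the whole loop is usually more informative than timing one statement. The sketch below uses `std::chrono::steady_clock` from the C++11 standard library for that purpose. The helper name `timeLoopMs` is hypothetical and not part of the answer above.

```cpp
#include <chrono>

// Hypothetical helper (not part of the original answer): runs `body`
// `iterations` times and returns the total wall-clock time in milliseconds.
template <typename F>
double timeLoopMs(long iterations, F&& body) {
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        body();
    }
    const auto end = std::chrono::steady_clock::now();
    // duration<double, std::milli> converts the tick count to fractional ms.
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

It could be called as `double ms = timeLoopMs(1000000, [&] { /* the five lines under test */ });`, taking care that the compiler cannot optimise the body away in a release build, for example by accumulating the computed cost into a variable that is printed afterwards.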