Two-element replacement performance in MATLAB

Purely as an experiment, I write down the sort functions in MATLAB, then run them through the MATLAB profiler. The aspect that I find most puzzled is related to the replacement of elements.

I found that the “official” way to replace two elements in a matrix

self.Data([i1, i2]) = self.Data([i2, i1]) 

works much slower than executing in four lines of code:

 e1 = self.Data(i1); e2 = self.Data(i2); self.Data(i1) = e2; self.Data(i2) = e1; 

The total length of time spent in the second example is 12 times less than one line of code in the first example.

Will anyone have an explanation why?

+6
performance matlab
source share
5 answers

Based on the suggestions I made, I did some more tests. It seems that a performance hit occurs when the same matrix refers to both LHS and RHS assignments.

My theory is that MATLAB uses an internal mechanism for counting links / copying to a record, and this leads to the fact that the entire matrix is ​​copied internally when it is referenced on both sides. (This is an assumption because I do not know the internals of MATLAB).

Here is the result of calling the function 885548 times. (The difference here is four times, not twelve times, as I wrote. Each of the functions has additional overhead, while in my first message I just summed up individual lines).

  swap1: 12.547 s
  swap2: 14.301 s
  swap3: 51.739 s

Here is the code:

  methods (Access = public) function swap(self, i1, i2) swap1(self, i1, i2); swap2(self, i1, i2); swap3(self, i1, i2); self.SwapCount = self.SwapCount + 1; end end methods (Access = private) % % swap1: stores values in temporary doubles % This has the best performance % function swap1(self, i1, i2) e1 = self.Data(i1); e2 = self.Data(i2); self.Data(i1) = e2; self.Data(i2) = e1; end % % swap2: stores values in a temporary matrix % Marginally slower than swap1 % function swap2(self, i1, i2) m = self.Data([i1, i2]); self.Data([i2, i1]) = m; end % % swap3: does not use variables for storage. % This has the worst performance % function swap3(self, i1, i2) self.Data([i1, i2]) = self.Data([i2, i1]); end end 
+6
source share

In the first (slow) approach, the value of RHS is a matrix, so I believe that MATLAB takes a penalty for performance when creating a new matrix for storing two elements. The second (fast) approach allows you to avoid this by working directly with the elements.

Check out " Performance Improvement Methods " in a MathWorks article on ways to improve your MATLAB code.

+4
source share

You can also do:

 tmp = self.Data(i1); self.Data(i1) = self.Data(i2); self.Data(i2) = tmp; 
+2
source share

Zach is potentially right that a temporary copy of the matrix can be made to perform the first operation, although I would jeopardize that there is some internal optimization inside MATLAB that tries to avoid this. This may be the function of the MATLAB version you are using. I tried both of your cases in version 7.1.0.246 (a couple of years) and saw only a difference in speed of about 2-2.5.

Perhaps this could be an example of a speed improvement caused by a "roll cycle". When performing vector operations at some level inside the inner code, there is a FOR loop that iterates over the indices that you change. By performing scalar operations in the second example, you avoid any overhead from the loops. Pay attention to these two (somewhat silly) examples:

 vec = [1 2 3 4]; %Example 1: for i = 1:4, vec(i) = vec(i)+1; end; %Example 2: vec(1) = vec(1)+1; vec(2) = vec(2)+1; vec(3) = vec(3)+1; vec(4) = vec(4)+1; 

Admittedly, it would be much easier to just use vector operations, such as:

 vec = vec+1; 

but the above examples are for illustration. When I repeat each example several times and their time, example 2 is actually slightly faster than example 1. For a small cycle with a known number (in the example, only 4), it can actually be more efficient to abandon the cycle. Of course, in this particular example, the above vector operation is the fastest.

I usually follow this rule: try a few different things and choose the fastest for your specific problem.

+2
source share

This post is worth updating because the JIT compiler is now a thing ( with R2015b ) and therefore timeit ( with R2013b ) for more reliable function synchronization.

Below is a short benchmarking function for elementary substitution in a large array. I used the terms “direct switching” and “using a temporary variable” to describe the two methods in the question, respectively.

The results are quite stunning, the performance of directly exchanging 2 used elements is getting worse and worse than using a temporary variable.

plot

 function benchie() % Variables for plotting, loop to increase size of the arrays M = 15; D = zeros(1,M); W = zeros(1,M); for n = 1:M; N = 2^n; % Create some random array of length N, and random indices to swap v = rand(N,1); x = randi([1, N], N, 1); y = randi([1, N], N, 1); % Time the functions D(n) = timeit(@()direct); W(n) = timeit(@()withtemp); end % Plotting plot(2.^(1:M), D, 2.^(1:M), W); legend('direct', 'with temp') xlabel('number of elements'); ylabel('time (s)') function direct() % Direct swapping of two elements for k = 1:N v([x(k) y(k)]) = v([y(k) x(k)]); end end function withtemp() % Using an intermediate temporary variable for k = 1:N tmp = v(y(k)); v(y(k)) = v(x(k)); v(x(k)) = tmp; end end end 
+1
source share

All Articles