Your explanation makes sense as well as code. However, you can fully get this vectorized by being smart about the features you want to use.
For interested readers, given your source code, what you want to do is the following (in pseudo-code):
Create a matrix U that size(Imlabel,1) x size(Imlabel,2) x num_class large. Each slice of this matrix will determine what Gibbs energy will cost 8 neighboring pixels in each coordinate (i,j) in this fragment.
For each class label x ...
a. For each pixel in the image (i,j) ...
Find those locations that are defined in an 8-pixel neighborhood (i,j) that are x and set them to beta
Find those locations that are defined in an 8-pixel neighborhood (i,j) that are not equal to x and set them to -beta
Calculate the sum of all these beta and -beta values within this 8-pixel neighborhood and set this to the location U(i,j,x)
So, what would I do, create a 3D cost matrix ... let this C , where each piece of y has the same dimensions as size(Imlabel,1) x size(Imlabel,2) , but each location (i,j) the slice will be beta if Imlabel(i,j) == y and -beta otherwise. Once you do this, you can convolve the ND of this matrix, but using the 2D core, so that you can calculate the sum of the 8-pixel neighborhoods of each slice separately.
Today's magic menu consists of bsxfun , convn and lateral permute order to sweeten the pot.
To get the cost matrix C , you can do this. This suggests that Imlabel not padded with zeros along the borders:
Imlabel =[ 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 2 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 2 2 2 1 1 1 1 1; 1 1 2 2 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 2 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1]; C = double(bsxfun(@eq, Imlabel, permute(1:num_class, [1 3 2]))); C(C == 0) = -beta; C(C == 1) = beta;
The permute function permute used to create a single 3D vector that comes from 1 to several classes, as you define. The reason this is required is because bsxfun does what is called broadcasting . What happens in this case is given by your Imlabel matrix, using a 3D vector will create a 3D cost matrix in combination with the eq or equals function. Each slice of this cost matrix will give you places where it is either equal to the corresponding label or not.
You can check the correctness of what we have for the location matrix, which is derived from bsxfun :
>> C = double(bsxfun(@eq, Imlabel, permute(1:num_class, [1 3 2]))) C(:,:,1) = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 C(:,:,2) = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As you can see, for each fragment we can see which locators share this label, which is indicated by this slice, and those that do not use the same label. After we have these locations, we need to assign it to double so that we can change this as a cost matrix, where -beta is a location that is not equal to the label for a particular slice, and beta -, This is done by two assignment operators after calling bsxfun .
Once you have this cost matrix, you can use the borders as before, but make sure the borders are set to -beta , as it was in your code:
mask = zeros(3,3); mask(2,2) = 1; Cpad = convn(C, mask); Cpad(Cpad == 0) = -beta;
The first line of code indicates the mask [0 0 0; 0 1 0; 0 0 0] [0 0 0; 0 1 0; 0 0 0] [0 0 0; 0 1 0; 0 0 0] that you defined in your code. Now, in order for each individual slice to have a border of zeros, we can use convn to help us do this.
Now that we have set this up, all you need to do is calculate the Gibbs energy regardless of the cutoff:
mask2 = ones(3,3); mask2(2,2) = 0; out = convn(Cpad, mask2, 'valid');
The first line of code denotes a mask, where each value is 1, except for the middle, which is 0, and this is a 3 x 3 mask. The reason you want to do this is because it replaces the if/else in your dual for loop logic. What you do is essentially a convolution where you add up all 8 neighbors except the very middle. This can be done using a mask that has all 1 s, except for the center pixel, and you set the value to 0.
Then we use convn to calculate the Gibbs energy per slice independently, but using the 'valid' flag so that we can remove the zero-add frame included at the beginning.
Now, if you look at out , it compares with what you have with U , but there is a slight shift due to how you index in U However, with the output out above, you can check what we have right, and this will cope perfectly with borderline cases:
>> out out(:,:,1) = -2 2 2 2 2 2 2 2 2 -2 2 8 8 8 8 8 6 6 6 2 2 8 8 8 8 8 6 8 6 2 2 6 4 2 4 6 6 6 6 2 2 4 2 0 4 6 8 8 8 2 2 4 2 0 2 6 8 6 6 0 2 6 4 4 6 8 8 6 8 0 2 8 8 8 8 8 8 6 6 0 -2 2 2 2 2 2 2 2 2 -2 out(:,:,2) = -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8 -6 -6 -6 -8 -8 -8 -8 -8 -8 -8 -6 -8 -6 -8 -8 -6 -4 -2 -4 -6 -6 -6 -6 -8 -8 -4 -2 0 -4 -6 -8 -8 -8 -8 -8 -4 -2 0 -2 -6 -8 -6 -6 -6 -8 -6 -4 -4 -6 -8 -8 -6 -8 -6 -8 -8 -8 -8 -8 -8 -8 -6 -6 -6 -8 -8 -8 -8 -8 -8 -8 -8 -8 -8
For example, if we look at the upper left corner of the first fragment, we will see that the eastern, southern and southeastern corners have the same label 1, and since beta = 1 , our sum will be 3, but then there are others There are 5 directions that we did not consider, and in your code you set them to -beta and therefore 3 - 5 = -2 .
Let's also take a look at the 6th row, 4th column and look at the second label or slice. This means that any cardinal directions equal to label 2, we must sum them up with beta / 1 , while everything that is not equal to label 2 is equal to- -beta / -1 . As you can see, there are exactly 4 tags out of 1 and 4 tags out of 2, and therefore they should cancel and give us 0.
You can verify that what we have done for all other locations in this matrix leads to the correct calculations.
The full code is simple:
Imlabel =[ 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 2 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 2 2 2 1 1 1 1 1; 1 1 2 2 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 2 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1]; C = double(bsxfun(@eq, Imlabel, permute(1:num_class, [1 3 2]))); C(C == 0) = -beta; C(C == 1) = beta; mask = zeros(3,3); mask(2,2) = 1; Cpad = convn(C, mask); Cpad(Cpad == 0) = -beta; mask2 = ones(3,3); mask2(2,2) = 0; out = convn(Cpad, mask2, 'valid');
In the case of synchronization, we can check how quickly the above approach compares with your loop source code. I used timeit to facilitate this.
A script is set up here that sets all the relevant variables and measures the time spent on your code and mine:
function test_clique_timing % Setup Imlabel_orig =[ 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 2 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1; 1 1 2 2 2 1 1 1 1 1; 1 1 2 2 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 2 1; 1 1 1 1 1 1 1 1 1 1; 1 1 1 1 1 1 1 1 1 1]; num_class=2; % x={1,2} beta = 1; function test_orig mask=[ 0 0 0;0 1 0;0 0 0]; Imlabel=conv2(Imlabel_orig,mask); % To solve pixel in corner beta=1; for label=1:num_class for i=2:size(Imlabel,1)-1 for j=2:size(Imlabel,2)-1 sum_V=0; %North, south, east and west neighbors if(label==Imlabel(i-1,j)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i,j+1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i+1,j)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i,j-1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end %% Diagonal pixels if(label==Imlabel(i-1,j-1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i-1,j+1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i+1,j+1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end if(label==Imlabel(i+1,j-1)) sum_V=sum_V+beta; else sum_V=sum_V-beta;end %% Save U(i-1,j-1,label)=sum_V; %% Because Index is extended before end end end end function test_conv C = double(bsxfun(@eq, Imlabel_orig, permute(1:num_class, [1 3 2]))); C(C == 0) = -beta; C(C == 1) = beta; mask = zeros(3,3); mask(2,2) = 1; Cpad = convn(C, mask); Cpad(Cpad == 0) = -beta; mask2 = ones(3,3); mask2(2,2) = 0; out = convn(Cpad, mask2, 'valid'); end time1 = timeit(@test_orig); time2 = timeit(@test_conv); fprintf('Loop code time: %f seconds\n', time1); fprintf('Vectorized code time: %f seconds\n', time2); fprintf('Speedup factor: %f', time1/time2); end
Running the above script, I get them for times:
Loop code time: 0.000049 seconds Vectorized code time: 0.000060 seconds Speedup factor: 0.816667
This is done in seconds, and I did it using MATLAB R2013a under Mac OS Mavericks 10.10.5. Acceleration doesn't look so great. In fact, this factor is less than 1, which is terrible. However, what you showed is such a small data set. We should try with a large dataset and see if this persists. Let's make a 2500 x 2500 matrix, with 10 class labels. I am going to replace the Imlabel matrix with:
rng(123); %// Set seed for reproducibility num_class = 10; Imlabel_orig = randi(10,2500,2500);
Generates random integers from 1 to 10 in a 2500 x 2500 matrix. Running this and running the code again gives:
Loop code time: 17.553669 seconds Vectorized code time: 3.321950 seconds Speedup factor: 5.284146
Yes ... it's much better. On a 2500 x 2500 matrix of 10 tags, it took about 17.5 seconds to calculate, while the vectorized code with bsxfun / convn / permute almost 5.2 times bsxfun / convn / permute than your loop source code.
Regarding the normalization term, which takes into account your Gibbs energy calculation, you currently have this:
P=zeros(size(U)); Z=sum(exp(-U),3); for i=1:num_class P(:,:,i)=exp(-U(:,:,i))./Z; end
If you use bsxfun , you can simply do this:
Z = sum(exp(-out),3)); %// out is U in my code P = bsxfun(@rdivide, exp(-out), Z);
This provides the same as your code. For each fragment in P we find exp and negate the slice as an input signal and divide each term by Z For bsxfun Z matrix is replicated for a large number of slices, as it is in out , and whether the element splitting is very similar to the loop code.