Replacement for repmat in MATLAB

I have a function that does the following loop many, many times:

    for cluster=1:max(bins), % bins is a list in the same format as kmeans() IDX output
        select=bins==cluster; % find group of values
        means(select,:)=repmat_fast_spec(meanOneIn(x(select,:)),sum(select),1);
        % (*, above) for each point, write the mean of all points in x that
        % share its label in bins to the equivalent row of means
        delta_x(select,:)=x(select,:)-(means(select,:)); %subtract out the mean from each point
    end

Note that repmat_fast_spec and meanOneIn are stripped-down versions of repmat() and mean(), respectively. I wonder if there is a way to do the assignment on the line labeled (*) that avoids repmat completely.

Any other thoughts on how to squeeze performance out of this thing would also be welcome.

+4
5 answers

Here is a possible improvement to avoid REPMAT:

    x = rand(20,4);
    bins = randi(3,[20 1]);

    d = zeros(size(x));
    for i=1:max(bins)
        idx = (bins==i);
        % mean(...,1) averages down the rows even if only one row is selected
        d(idx,:) = bsxfun(@minus, x(idx,:), mean(x(idx,:),1));
    end

Another possibility:

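    x = rand(20,4);
    bins = randi(3,[20 1]);

    m = zeros(max(bins),size(x,2));
    for i=1:max(bins)
        % mean(...,1) averages down the rows even if only one row is selected
        m(i,:) = mean( x(bins==i,:), 1 );
    end
    dd = x - m(bins,:);   % index the table of means by label; no repmat needed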
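In MATLAB R2016b and later, arithmetic operators expand singleton dimensions implicitly, so the bsxfun call in the first variant can be written as a plain subtraction. A minimal sketch, assuming such a release; d2 and maxDiff are illustrative names that do not appear in the code above:

```matlab
x = rand(20,4);
bins = randi(3,[20 1]);

% Reference result using bsxfun, as in the first variant
d = zeros(size(x));
for i = 1:max(bins)
    idx = (bins==i);
    d(idx,:) = bsxfun(@minus, x(idx,:), mean(x(idx,:),1));
end

% Same computation with implicit expansion (assumes R2016b or later)
d2 = zeros(size(x));
for i = 1:max(bins)
    idx = (bins==i);
    d2(idx,:) = x(idx,:) - mean(x(idx,:),1);
end

% Largest absolute difference between the two results
maxDiff = max(abs(d(:) - d2(:)));
```

On older releases the plain subtraction raises a dimension-mismatch error, so bsxfun remains the portable choice there.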
+1

One obvious way to speed up calculations in MATLAB is to create a MEX file. You can compile C code and do whatever operations you want. If you are looking for the fastest possible performance, turning the operation into a custom MEX file is likely to be the way to go.

+1

You can get some improvement using ACCUMARRAY.

    %# gather array sizes
    [nPts,nDims] = size(x);
    nBins = max(bins);

    %# calculate means. Not sure whether it might be faster to loop over nDims
    meansCell = accumarray(bins,1:nPts,[nBins,1],@(idx){mean(x(idx,:),1)},{NaN(1,nDims)});
    means = cell2mat(meansCell);

    %# subtract cluster means from x - this is how you can avoid repmat in your code, btw.
    %# all you need is the array with cluster means.
    delta_x = x - means(bins,:);
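The comment above wonders whether looping over nDims might be faster. One way to try that, offered only as a sketch and not benchmarked here, is to let accumarray sum each column per cluster and divide by the per-cluster counts. The random x and bins are stand-in data, and counts is an illustrative name:

```matlab
x = rand(20,4);
bins = randi(3,[20 1]);   % assumed: column vector of positive integer labels

nDims = size(x,2);
nBins = max(bins);

% number of points in each cluster
counts = accumarray(bins, 1, [nBins,1]);

% per-cluster sums, one column of x at a time, divided by the counts;
% an empty cluster gives 0/0 = NaN, matching the NaN fill value above
means = zeros(nBins,nDims);
for d = 1:nDims
    means(:,d) = accumarray(bins, x(:,d), [nBins,1]) ./ counts;
end

% subtract each point's cluster mean
delta_x = x - means(bins,:);
```

This avoids the cell array and the anonymous function call per cluster. Whether it is actually faster depends on nDims and nBins, so it is worth timing on real data.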
+1

First of all: format your code properly, and surround every operator and assignment with spaces. I find your code very hard to read because it looks like one big block of characters.

Beyond that, you can follow the other answers and convert the code to C (MEX) or Java, automatically or by hand, but in my humble opinion this is a last resort. Only consider it if performance still falls short after everything else. On the other hand, your algorithm does not show any obvious flaws.

But the first thing you should do when trying to improve performance is profile. Use the MATLAB profiler to determine which part of your code is causing the problem, and how much you need to improve it to meet your expectations. If you don't know, define that target first; otherwise you will be looking for a needle in a haystack that may not even be there. MATLAB will never be the fastest kid on the block with respect to run time, but it may be the fastest with respect to development time for certain kinds of operations. In that regard, it can be worthwhile to sacrifice the clarity of MATLAB for the execution speed of other languages (C or even Java). But by the same reasoning, you could just as well code everything in assembler to squeeze every last bit of performance out of the code.
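As a rough illustration of the profiling step, the sketch below wraps a version of the question's loop in profile on / profile off. The data sizes and the repeat count are made up, and the built-in repmat and mean stand in for the asker's repmat_fast_spec and meanOneIn, which are not shown in the question:

```matlab
x = rand(20000,4);               % illustrative sizes, not from the question
bins = randi(50,[20000 1]);

profile on
for rep = 1:50                   % repeat so the loop dominates the report
    delta_x = zeros(size(x));
    for cluster = 1:max(bins)
        select = (bins==cluster);
        % built-in repmat/mean used as stand-ins for the custom helpers
        delta_x(select,:) = x(select,:) - repmat(mean(x(select,:),1), sum(select), 1);
    end
end
profile off

% collect the recorded timings as a struct and rank functions by total time
p = profile('info');
[~, order] = sort([p.FunctionTable.TotalTime], 'descend');
names = {p.FunctionTable(order).FunctionName};
```

Calling profile viewer afterwards opens the interactive report, which also gives per-line timings and so shows whether the repmat line is really where the time goes.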

+1

Another obvious way to speed up computation in MATLAB is to write a Java library (similar to @aardvarkk's answer), since MATLAB ships with a Java Virtual Machine and integrates very well with custom Java libraries.

Java is easier than C to compile and to interface with MATLAB. It can be slower than C in some cases, but the just-in-time (JIT) compiler in the Java virtual machine generally makes it fast.

0
