I have a row of data, say A = [0 1 1 1 0 0].
Matrix B contains many rows. As a toy example, say B = [1 1 1 0 1 0; 1 0 0 1 0 1].
I want to count the columns in which A differs from each row of B, and use those counts to find which row of B is most similar to A. In the example above, A differs from B(1,:) in columns 1, 4 and 5, for a total of 3 differences. A differs from B(2,:) in columns 1, 2, 3 and 6, for a total of 4 differences. So I would like to return index 1, indicating that A is most similar to B(1,:).
In reality, B has about 50,000 rows, and A and B each have about 800 columns. My current code for finding the most similar row is below:
[minDiff, idx] = min(sum(xor(repmat(A, B_rows, 1), B), 2)); % idx is the most similar row of B
It works, but it is very slow. Any insight into which function is taking so long, and how I could improve it?
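For reference, here is one possible direction, sketched on the toy data above. It is not a measured fix. It assumes A and B contain only 0s and 1s, and it replaces the repmat/xor step with a single matrix-vector product, which avoids building a second matrix the size of B. The names Ad, Bd and d are introduced here for illustration only. Whether this is actually faster on the real 50,000 x 800 data would need to be checked with the profiler.

```matlab
% Toy data from the question; the real B is about 50,000 x 800.
A = [0 1 1 1 0 0];
B = [1 1 1 0 1 0; 1 0 0 1 0 1];

% Work in double so the sums and the matrix product behave the same
% whether the inputs are stored as logical or numeric arrays.
Ad = double(A);
Bd = double(B);

% Assumption: all entries are 0 or 1. Then the number of differing columns
% between A and row i of B equals
%   sum(A) + sum(B(i,:)) - 2 * (number of columns where both are 1),
% and the last term for every row at once is the product Bd * Ad.'.
d = sum(Bd, 2) + sum(Ad) - 2 * (Bd * Ad.');

% idx is the first row of B with the fewest differences from A.
[minDiff, idx] = min(d);
```

On the toy data this gives d = [3; 4], so idx is 1, matching the counts worked out by hand above.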