What is a “good” R value when comparing two signals using cross-correlation?

Question


I apologize for being a bit verbose: if you want to skip all the background mumbo jumbo, you can go straight to my question below.

This is pretty much a follow-up to a question I asked previously on how to compare two 1D (time-dependent) signals. One of the answers I got was to use the cross-correlation function (xcorr in MATLAB), which I did.

Background Information

Perhaps a little background information will be useful: I'm trying to implement an independent component analysis algorithm. One of my informal tests is to (1) create a test case by (a) generating 2 random vectors (1x1000), (b) combining the vectors into a 2x1000 matrix (called "S") and multiplying this by a 2x2 mixing matrix (called "A") to give me a new matrix (let's call it "T").

In short: T = A * S

(2) Then I run the ICA algorithm to generate the inverse of the mixing matrix (called "W"), and (3) multiply "T" by "W" to (hopefully) give me a reconstruction of the original signal matrix (called "X").

In short: X = W * T

(4) Now I want to compare "S" and "X". Although "S" and "X" are 2x1000, I simply compare S(1,:) with X(1,:) and S(2,:) with X(2,:), each of which is 1x1000, which makes them 1D signals. (I have another step that makes sure these vectors are the right ones to compare with each other, and I also normalize the signals.)

So my current difficulty is how to "grade" how closely S(1,:) matches X(1,:), and likewise S(2,:) and X(2,:).

So far I have used something like: r1 = max(abs(xcorr(S(1,:), X(1,:))))
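For reference, the test case described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the values in A are made up, the sources are zero-mean uniform noise, and inv(A) stands in for the W that an ICA routine would return (no ICA is actually run here). xcorr needs the Signal Processing Toolbox; its 'coeff' option scales the result so that a signal correlated with itself peaks at 1.

```matlab
% Sketch of the test case; inv(A) is a stand-in for the ICA output W.
S = rand(2, 1000) - 0.5;        % two zero-mean random source signals, one per row
A = [1 2; 3 4];                 % example 2x2 mixing matrix (an assumption)
T = A * S;                      % mixed signals, T = A * S
W = inv(A);                     % stand-in for the unmixing matrix from ICA
X = W * T;                      % reconstructed sources, X = W * T

% Normalized peak cross-correlation for the matching rows ...
r1 = max(abs(xcorr(S(1,:), X(1,:), 'coeff')));
r2 = max(abs(xcorr(S(2,:), X(2,:), 'coeff')));
% ... and for a non-matching pair, as a baseline for "unrelated" signals.
rCross = max(abs(xcorr(S(1,:), X(2,:), 'coeff')));

fprintf('r1 = %.4f, r2 = %.4f, rCross = %.4f\n', r1, r2, rCross);
```

With an exact inverse, r1 and r2 come out at 1 (up to rounding), while rCross shows the level that two unrelated signals reach, which gives a baseline for judging real ICA output.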

My question

Assuming that using the cross-correlation function is an acceptable way to compare the similarity of two signals, what would be considered a good R value for judging the similarity of the signals? Wikipedia states that this is a very subjective area, so I defer to the better judgment of those who have experience in this field.

As you can tell, I do not come from an EE/DSP/statistics background (I am a medical student), so I am going through a kind of "baptism by fire" right now, and I would appreciate all the help I can get. Thanks!

+4

algorithm statistics matlab signal-processing

oort Aug 17 '09 at 17:51


6 answers

A good starting point is to understand what a perfect match will look like by calculating the autocorrelation for each signal (i.e., cross-correlate each signal with itself).

+1

tom10 Aug 17 '09 at 20:18


THIS IS A COMPLETE GUESS - but I would guess that max(abs(xcorr(S(1,:), X(1,:)))) > 0.8 means success. Just out of curiosity, what values do you get for max(abs(xcorr(S(1,:), X(2,:))))?

Another approach to validating your algorithm might be to compare A and W. If W is computed correctly, it should be A^-1, so could you calculate a measure such as |A*W - I|? You may need to normalize by the trace of A*W.

Returning to the original question, I come from a DSP background, so I get to deal with fairly noise-free signals. I understand that this is not a luxury you get in biology :), so my guess of 0.8 may be very optimistic. Perhaps you can find some literature in your field, even if it does not use cross-correlation exactly.
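A rough sketch of such a measure, using the Frobenius norm as one possible choice for |.|. The matrices here are made-up examples: W is the exact inverse plus a small artificial error, standing in for what an ICA run might return. Note that ICA generally recovers sources only up to ordering and scaling, so this check assumes the rows of W have already been matched and rescaled.

```matlab
% Hypothetical check of an unmixing matrix W against a known mixing matrix A.
A = [1 2; 3 4];                          % example mixing matrix (an assumption)
W = inv(A) + 1e-3 * [1 -1; 2 0.5];       % pretend ICA output: inverse plus a small error
P = A * W;                               % equals the identity when W is exactly inv(A)

% Deviation from the identity (Frobenius norm), normalized by the trace of A*W.
err = norm(P - eye(2), 'fro') / abs(trace(P));
fprintf('normalized deviation from identity: %.4g\n', err);
```

A value near 0 means W is close to A^-1; how small is small enough is again a judgment call.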

+1

mtrw Aug 17 '09 at 21:38


Usually in such cases people talk about the "false acceptance rate" (FAR) and the "false rejection rate" (FRR). The first describes how often the algorithm says "similar" for dissimilar signals, the second the reverse.

Choosing the threshold thus becomes a trade-off between these two rates. To make FAR = 0 the threshold has to be 1; to make FRR = 0 the threshold has to be -1.

So you will probably need to decide which trade-off between FAR and FRR is acceptable in your situation, and that will give you the right threshold value.

Mathematically, this can be expressed in different ways. Just a few examples:
1. fix one of the rates at an acceptable value and minimize the other
2. minimize max(FRR, FAR)
3. minimize a*FRR + b*FAR
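As an illustration of the second option, here is a minimal sketch that scans candidate thresholds and picks the one minimizing max(FRR, FAR). The score lists are invented for the example; in practice they would come from test runs where it is known whether each pair of signals should match.

```matlab
% Hypothetical similarity scores (e.g. normalized cross-correlation peaks).
scoresMatch = [0.95 0.91 0.88 0.97 0.79 0.93];   % pairs known to be similar
scoresOther = [0.35 0.52 0.81 0.44 0.60 0.28];   % pairs known to be dissimilar

thresholds = 0:0.01:1;
% FAR: fraction of dissimilar pairs that would be accepted at each threshold.
FAR = arrayfun(@(th) mean(scoresOther >= th), thresholds);
% FRR: fraction of similar pairs that would be rejected at each threshold.
FRR = arrayfun(@(th) mean(scoresMatch <  th), thresholds);

% Option 2 from the list above: minimize the larger of the two rates.
[worstRate, k] = min(max(FAR, FRR));
bestThreshold = thresholds(k);
fprintf('threshold %.2f gives max(FAR, FRR) = %.3f\n', bestThreshold, worstRate);
```

Because the two score sets overlap, no threshold drives both rates to zero, which is exactly the trade-off described above.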

+1

maxim1000 Aug 18 '09 at 9:11


Since they should be equal, the correlation coefficient should be high, between 0.99 and 1. I would also take the max and abs functions out of your calculation.

EDIT: I spoke too soon. I confused cross-correlation with the correlation coefficient, which is something completely different. My answer may not be worth much.

0

Adam Crume Aug 17 '09 at 18:23


I would agree that the result will be subjective. Something involving the sum of the squared differences, element by element, would have some value. Two identical arrays would give 0 in that form. You would then have to decide what value counts as "bad". Make two different vectors that are "not so bad", and find their cross-correlation coefficient to use as a guide.

(Parenthetically: if you were computing a correlation coefficient, where 1 or -1 would be great and 0 would be awful, I have been told by biostatisticians that a real-life value of 0.7 is extremely good. I understand that this is not quite what you are doing, but the correlation coefficient was mentioned in a comment above.)
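A small sketch of the sum-of-squared-differences idea. The signals are made up for illustration (a sine and a slightly noisy copy of it), and the measure only makes sense if the two arrays are already aligned in time and on the same scale.

```matlab
% Hypothetical signals: a reference and a "not so bad" noisy copy of it.
t = 0:0.001:1;
a = sin(2*pi*5*t);
b = a + 0.05*randn(size(t));

ssdSame = sum((a - a).^2);       % identical arrays give exactly 0
ssd     = sum((a - b).^2);       % grows as the arrays differ more
relSsd  = ssd / sum(a.^2);       % scaled by the energy of the reference signal
fprintf('SSD = %.4f, relative SSD = %.4f\n', ssd, relSsd);
```

Dividing by the energy of the reference makes the number easier to compare across signals of different size.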

0

user32848 Aug 17 '09 at 20:30


Jason s · Accepted Answer · 2009-08-19T13:08:38+0000

(edit: for a direct answer to the question about R values, see below)

One way to approach this is to use cross-correlation. Keep in mind that you have to normalize the amplitudes and correct for delays: if you have signal S1, and signal S2 is identical in shape but has half the amplitude and is delayed by 3 samples, they are still perfectly correlated.

For instance:

 >> t = 0:0.001:1;
 >> y = @(t) sin(10*t).*exp(-10*t).*(t > 0);
 >> S1 = y(t);
 >> S2 = 0.4*y(t-0.1);
 >> plot(t,S1,t,S2);

These should have a perfect correlation coefficient. One way to compute this is to use the maximum of the cross-correlation:

 >> f = @(S1,S2) max(xcorr(S1,S2));
 f =
     @(S1,S2) max(xcorr(S1,S2))
 >> disp(f(S1,S1)); disp(f(S2,S2)); disp(f(S1,S2));
    12.5000
     2.0000
     5.0000

Taking the maximum value of xcorr() takes care of the time delay between the signals. As for correcting the amplitude, you can normalize the signals so that their self-cross-correlations are 1.0, or you can fold that equivalent step into the following:

ρ² = f(S1,S2)² / (f(S1,S1) * f(S2,S2))

In this case, ρ² = 5 * 5 / (12.5 * 2) = 1.0

You can solve for ρ itself, i.e. ρ = f(S1,S2) / sqrt(f(S1,S1) * f(S2,S2)); just keep in mind that both 1.0 and -1.0 mean perfectly correlated (-1.0 has the opposite sign).

Try it on your signals!
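As a sketch of how this can be wrapped up for a pair of signals, the snippet below computes ρ and also reads the delay off the second output of xcorr (the lag vector). It reuses the example signals from above; the variable names are just for illustration, and xcorr needs the Signal Processing Toolbox.

```matlab
% Example signals from above: S2 is a scaled, delayed copy of S1.
t  = 0:0.001:1;
y  = @(t) sin(10*t).*exp(-10*t).*(t > 0);
S1 = y(t);
S2 = 0.4*y(t - 0.1);

[c, lags] = xcorr(S1, S2);            % lags run from -(N-1) to N-1 samples
[peak, k] = max(c);                   % peak of the cross-correlation
rho = peak / sqrt(dot(S1,S1) * dot(S2,S2));   % normalized by the signal energies

% With xcorr(S1,S2), a negative lag at the peak means S2 lags behind S1.
fprintf('rho = %.4f at lag %d samples\n', rho, lags(k));
```

For these signals ρ comes out at essentially 1 and the peak sits at a lag of 100 samples in magnitude, matching the 0.1 s delay at a 1 ms sample spacing.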

As to which threshold to use for acceptance/rejection, it really depends on what kind of signals you have. 0.9 and above is fairly good, but it can be misleading. I would look at the residual signal that is left after you subtract out the correlated version. You can do this by looking at the time index of the maximum xcorr() value:

 >> t = 0:0.001:1;
 >> y = @(a,t) sin(a*t).*exp(-a*t).*(t > 0);
 >> S1=y(10,t);
 >> S2=0.4*y(9,t-0.1);
 >> f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2))
 ans =
     0.9959

This looks pretty good for a correlation. But let's try fitting S2 with a scaled/shifted multiple of S1:

 >> [A,i]=max(xcorr(S1,S2)); tshift = i-length(S1);
 >> S2fit = zeros(size(S2)); S2fit(1-tshift:end) = A/f(S1,S1)*S1(1:end+tshift);
 >> plot(t,[S2; S2fit]); % fit S2 using S1 as a basis

 >> plot(t,[S2-S2fit]); % residual

The residual has some energy in it; to get a feel for how much, you can use this:

 >> S2res=S2-S2fit;
 >> dot(S2res,S2res)/dot(S2,S2)
 ans =
     0.0081
 >> sqrt(dot(S2res,S2res)/dot(S2,S2))
 ans =
     0.0900

This suggests that the residual has about 0.81% of the energy (9% of the rms amplitude) of the original signal S2. (The dot product of a 1D signal with itself will always be equal to the maximum value of the cross-correlation of that signal with itself.)

I don't think there is a silver bullet for answering how similar two signals are to each other, but I hope I have given you some ideas that may be applicable to your circumstances.
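The remark in parentheses is easy to check numerically. The snippet below is a small sanity check on a made-up test signal, not part of the method itself; it confirms that the autocorrelation peaks at zero lag and that the peak equals the signal's energy.

```matlab
% Hypothetical test signal (same shape as S1 in the examples above).
t = 0:0.001:1;
S = sin(10*t).*exp(-10*t);

energyDot = dot(S, S);                % energy as a dot product
[c, lags] = xcorr(S, S);              % autocorrelation over all lags
[energyXc, k] = max(c);               % its peak value and position

fprintf('dot = %.6f, max xcorr = %.6f, peak lag = %d\n', ...
        energyDot, energyXc, lags(k));
```

As a related cross-check, 1 - 0.9959^2 is roughly 0.0082, which is close to the residual energy fraction reported above; that is what one would expect when the scaled, shifted copy is a least-squares fit, though the truncation at the ends of the signals makes it only approximate.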
