What does uniquetol do, exactly?

Question

The uniquetol function, introduced in R2015a, computes "unique elements within tolerance." In particular,

C = uniquetol(A,tol) returns unique elements in A using the tol tolerance.

But the problem of finding unique elements within a given tolerance has several solutions. Which one is actually produced?

Let's look at two examples:

Let A = [3 5 7 9] with an absolute tolerance of 2.5. The output may be [3 7], or it may be [5 9]. Both solutions satisfy the requirements.
For A = [1 3 5 7 9] with an absolute tolerance of 2.5, the output can be [1 5 9] or [3 7]. Thus, even the number of elements in the output can vary.

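See this nice discussion about the transitivity problem that underlies this ambiguity: being "within tolerance" is not a transitive relation, so the elements cannot always be split into unambiguous groups.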

So how does uniquetol work? What result does it produce among several existing solutions?

+8

arrays matlab unique reverse-engineering

Luis mendo Sep 7 '17 at 16:32

1 answer

Luis mendo · Answer 1 · 2017-09-07T16:32:20+0000

To simplify, I am considering the version of uniquetol with one output and two inputs,

  C = uniquetol(A, tol);

where the first input A is a double vector. In particular, this means that:

The 'ByRows' option of uniquetol is not used.
The first input is a vector. If it were not, uniquetol would implicitly linearize it to a column, as usual.
The second input, which defines the tolerance, is interpreted as follows:
Two values, u and v, are within tolerance if abs(u-v) <= tol*max(abs(A(:)))
That is, the specified tolerance is relative by default. The actual tolerance used in the comparisons is obtained by scaling tol by the maximum absolute value in A.
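As a small illustration of this scaling, the sketch below requests an absolute tolerance of 2.5 in two ways: by dividing by max(abs(A(:))) by hand, and through the 'DataScale' name-value option, which, as far as I understand the documentation, replaces the default scale factor. The variable names are mine and the input is taken from the question.

```matlab
% Sketch: two ways to ask uniquetol for an absolute tolerance of 2.5.
A = [3 5 7 9];                              % example input from the question
absTol = 2.5;                               % desired absolute tolerance

% Option 1: undo the default scaling by max(abs(A(:))).
C_rel = uniquetol(A, absTol / max(abs(A(:))));

% Option 2: set the scale factor to 1 so tol is used as given
% (assumes 'DataScale' behaves as documented: abs(u-v) <= tol*DS).
C_abs = uniquetol(A, absTol, 'DataScale', 1);

disp(C_rel)
disp(C_abs)
```

Both calls should select the same elements here, since no pairwise difference in A lies close to the tolerance boundary.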

Based on these considerations, it seems that the approach uniquetol uses is:

1. Sort A.
2. Select the first entry of the sorted A and set it as the reference value (this value will be updated later).
3. Write the reference value to the output C.
4. Skip subsequent entries of the sorted A until one is found that is outside the tolerance of the reference value. When such an entry is found, take it as the new reference value and return to step 3.

Of course, I am not saying that this is what uniquetol does internally. But the output seems to be the same, so the procedure is functionally equivalent to what uniquetol does.

The following code implements the approach described above (inefficient code, just to illustrate this).

  % Inputs A, tol
  % Output C
  tol_scaled = tol*max(abs(A(:))); % scale tolerance
  C = []; % initialize output. Will be extended
  ref = NaN; % initialize reference value to NaN. This will immediately cause
             % the first sorted entry to become the new reference
  for a = sort(A(:)).'
      if ~(a-ref <= tol_scaled)
          ref = a;
          C(end+1) = ref;
      end
  end

To test this, let's generate random data and compare the output of uniquetol and the code above:

  clear
  N = 1e3; % number of realizations
  S = 1e5; % maximum input size
  for n = 1:N
      % Generate inputs:
      s = randi(S); % input size
      A = (2*rand(1,s)-1) / rand; % random input of length s; positive and
                                  % negative values; random scaling
      tol = .1*rand; % random tolerance (relative). Change value .1 as desired
      % Compute output:
      tol_scaled = tol*max(abs(A(:))); % scale tolerance
      C = []; % initialize output. Will be extended
      ref = NaN; % initialize reference value to NaN. This will immediately cause
                 % the first sorted entry to become the new reference
      for a = sort(A(:)).'
          if ~(a-ref <= tol_scaled)
              ref = a;
              C(end+1) = ref;
          end
      end
      % Check if output is equal to that of uniquetol:
      assert(isequal(C, uniquetol(A, tol)))
  end

In all my tests, this ran without the assertion failing.

So in summary, uniquetol seems to sort the input, pick its first entry, and skip subsequent entries for as long as they stay within tolerance of the current reference.

For the two examples in the question, the results are as follows. Note that the second input is given as 2.5/9, where 9 is the maximum of the first input, in order to obtain an absolute tolerance of 2.5:

  >> uniquetol([1 3 5 7 9], 2.5/9)
  ans =
       1     5     9
  >> uniquetol([3 5 7 9], 2.5/9)
  ans =
       3     7
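These outputs are what the greedy procedure predicts. The sketch below replays it on the same two inputs, applying the absolute tolerance of 2.5 directly rather than calling uniquetol. The names inputs, absTol and results are only illustrative.

```matlab
% Sketch only: replays the greedy procedure on the question's two inputs,
% using an absolute tolerance directly instead of calling uniquetol.
inputs = {[1 3 5 7 9], [3 5 7 9]};   % examples from the question
absTol = 2.5;                        % absolute tolerance
results = cell(size(inputs));        % kept reference values, one cell per input
for k = 1:numel(inputs)
    C = [];
    ref = NaN;                       % NaN forces the first sorted entry to be kept
    for a = sort(inputs{k}(:)).'
        if ~(a - ref <= absTol)
            ref = a;                 % a is out of tolerance: new reference
            C(end+1) = ref;          %#ok<AGROW>
        end
    end
    results{k} = C;
    fprintf('%s -> %s\n', mat2str(inputs{k}), mat2str(C));
end
% prints:
% [1 3 5 7 9] -> [1 5 9]
% [3 5 7 9] -> [3 7]
```

In the first input, 3 is absorbed by the reference 1 and 7 by the reference 5. In the second, 5 is absorbed by 3 and 9 by 7. Of the candidate solutions listed in the question, the procedure always yields the one that starts from the smallest element.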
