How can I find the last item among a set of grouping variables?

I have a 150x2 matrix in which the first column contains numbers that can be treated as "grouping variables" and the second column contains the values associated with them. A small 12x2 version looks like this:

 200741 5441
 200741 5524
 200741 5428
 200742 5670
 200742 5668
 200742 5559
 200742 5215
 200743 5184
 200743 5473
 200743 5496
 200743 5568
 200743 5702

I would like to find the last of the values associated with each grouping variable. For the example above, that is 5428 (for grouping variable 200741), 5215 (for 200742) and 5702 (for 200743). After I find the unique values of the grouping variables in column 1, how do I take the last element in column 2 corresponding to each grouping variable? How can this be done in MATLAB?

4 answers

If the first column is sorted and contains positive integers, you can use accumarray (though stretching it a bit):

 result = nonzeros(accumarray(A(:,1), A(:,2), [], @(x) x(end), 0, true)); 

Notes:

  • The sorting requirement comes from the documentation, which states:

    If the indices in subs are not sorted relative to their linear indices, then accumarray may not always preserve the order of the data in val when it passes them to fun

    and therefore @(x) x(end) will not always give the last element.

  • The sparse version of accumarray (sixth argument true) is used because the grouping values are large (as in the example); a full output would have one row for every integer up to the largest grouping value.
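As an illustration only, here is a sketch of a closely related variant that avoids both the sparse output and the nonzeros call by first mapping the group labels to compact indices with the third output of unique. This is not the answer's exact code; the variable names groups, idx and lastVals are made up for the example, and it still assumes the first column is already sorted.

```matlab
% Sample data from the question: column 1 = group label, column 2 = value
A = [200741 5441; 200741 5524; 200741 5428;
     200742 5670; 200742 5668; 200742 5559; 200742 5215;
     200743 5184; 200743 5473; 200743 5496; 200743 5568; 200743 5702];

% Map the large group labels to compact indices 1..K
[groups, ~, idx] = unique(A(:,1));

% idx is non-decreasing because A(:,1) is already sorted (assumption),
% so x(end) is the last value of each group in row order
lastVals = accumarray(idx, A(:,2), [], @(x) x(end));

% One row per group: label, last value
disp([groups lastVals])
```

Because no nonzeros call is involved, a group whose last value happens to be 0 is kept rather than silently dropped.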

Here is one approach, assuming A is the input array:

 %// Sort the input matrix based on the column-1 values,
 %// just for cases when the "grouping variables" are not already sorted
 A = sortrows(A,1)

 %// Use diff to find out the row indices where "groups" switch
 %// to give us the last row indices for each "grouping", which
 %// could be used to index into second column of A for final output
 out = A([diff(A(:,1))~=0 ; true],2)

Assuming the matrix is sorted by the grouping numbers, as in the example, and is named a, you can get logical indices like this:

 I=a(1:end-1, 1) ~= a(2:end, 1) 

I is true at the index of the last row of each group, except for the final group. So to get the rows you want, use a(I, :) and do not forget to append the last row of a. Or, as a single line:

 [a( a(1:end-1, 1) ~= a(2:end, 1), : ); a(end, :)] 

You can use unique to get the index of the first or last occurrence of each value; depending on the MATLAB version, you may need the 'legacy' flag to make sure it returns the last index:

 [B,ind,~] = unique(A(:,1),'last','legacy');
 out = A(ind,2);
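For comparison, a minimal sketch of the same idea without the 'legacy' flag, assuming a release (R2013a or later) in which unique accepts the 'last' occurrence option on its own. The variable names groups, ind and out are chosen for the example.

```matlab
% Sample data from the question: column 1 = group label, column 2 = value
A = [200741 5441; 200741 5524; 200741 5428;
     200742 5670; 200742 5668; 200742 5559; 200742 5215;
     200743 5184; 200743 5473; 200743 5496; 200743 5568; 200743 5702];

% ind(k) is the row index of the last occurrence of groups(k) in A(:,1)
[groups, ind] = unique(A(:,1), 'last');

% Pick the value stored in that row for each group
out = A(ind, 2);

disp([groups out])
```

Since unique reports the row index of the last occurrence directly, this form does not depend on the first column being sorted.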

Source: https://habr.com/ru/post/1213171/

