Ranking in Dataset Rows [sas]

Suppose I have a data set with n rows and p columns, where each entry in the data set is a real number. I am looking for a way to rank the p columns within each row. The result of this ranking should be a length-p vector of ranks that takes ties into account.

So, say my data set has 5 columns. The first row might be something like row 1 = {10, 13, 3, 3, -4}. I would like to perform some operation on this row and get back row 1 ranks = {3, 4, 2, 2, 1}. The second row might be something like row 2 = {8, 3, -6, 5, 2}, and the result for this row should be row 2 ranks = {5, 3, 1, 4, 2}.

Is this functionality implemented in SAS? I have written code that does not take ties into account, but ties are common enough that fixing the incorrectly ranked rows by hand would take an unreasonable amount of time.

4 answers

Interesting question; here is one possible solution:

    data have;
      p1=10; p2=13; p3=3; p4=3; p5=-4; output;
      p1=8; p2=3; p3=-6; p4=5; p5=2; output;
    run;

    data want;
      set have;
      array p(*) p1-p5;   /* original values */
      array c(*) c1-c5;   /* sorted copy of the values */
      array r(*) r1-r5;   /* resulting ranks */

      /* Copy vector to temp array and sort */
      do i=1 to dim(p);
        c(i) = p(i);
      end;
      call sortn(of c(*));

      /* Search new sorted array for the original position */
      do i=1 to dim(c);
        /* The rank only increases when the sorted value changes, so ties share a rank */
        if i = 1 then rank=1;
        else if c(i) ne c(i-1) then rank + 1;
        do j=1 to dim(p);
          if p(j) = c(i) then do;
            r(j) = rank;
          end;
        end;
      end;

      /* PUT statement to see result in log */
      put +3 p(*) / +3 c(*) / +3 r(*);
      drop i j rank c1-c5;
    run;

It seems to me that for this you will need several arrays.

  • Array 1: an array to store the ranks
  • Array 2: an array for the sorted values
  • Array 3: the original, unchanged data

I don't have time right now to write the code, but something like this would do most of the heavy lifting:

http://support.sas.com/kb/24/754.html


I might as well add this, even though the OP said they do not use IML, in case others find it useful when searching. IML is really the easiest way to solve this problem, since it is fundamentally a vector/matrix problem:

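    proc iml;
      p = {10 13 3 3 -4,
           5 6 5 2 3};
      r = j(2, 5, .);          /* 2 x 5 matrix of missing values to hold the ranks */
      print p r;
      do i = 1 to nrow(p);
        r[i,] = ranktie(p[i,]);  /* rank each row, averaging the ranks of tied values */
      end;
      print p r;
    quit;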

It handles ties slightly differently from what the OP asked for, so some work would be required to make it match the requested result exactly. Overall, though, 1, 2.5, 2.5, 4, 5 [or 1, 2, 2, 4, 5] is probably what you really want, not 1, 2, 2, 3, 4. The values ranked 4 and 5 should stay at 4 and 5, and not move up to 3 and 4, when the values in positions 2 and 3 tie.
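To make the difference between the tie conventions concrete, here is a minimal data step sketch of the "1, 2, 2, 4, 5" style, where ranks are skipped after a tie. It is not taken from any answer above: the data set name want_low and the index variables j and k are made up for illustration, and it assumes the row contains no missing values (SAS treats a missing value as smaller than any number).

```sas
data have;
  input p1-p5;
  datalines;
10 13 3 3 -4
8 3 -6 5 2
;
run;

data want_low;   /* hypothetical name for the output data set */
  set have;
  array p(*) p1-p5;
  array r(*) r1-r5;
  /* Rank of p(j) = 1 + number of values in the same row strictly below it. */
  /* Tied values get the same rank, and the ranks after a tie are skipped.  */
  do j = 1 to dim(p);
    r(j) = 1;
    do k = 1 to dim(p);
      if p(k) < p(j) then r(j) = r(j) + 1;
    end;
  end;
  drop j k;
run;
```

For the first row {10, 13, 3, 3, -4} this gives {4, 5, 2, 2, 1}, whereas the question asks for {3, 4, 2, 2, 1}, so which convention is right depends on what the ranks are used for.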


Just for fun, given the OP's reply that a new ranked data set is needed, here is the PROC RANK method. It is probably not faster than a data step, but it may be simpler and easier to use in some situations, with the added benefit that you can't really make a coding mistake without it failing outright.

    data have;
      input id x1-x5;
      datalines;
    1 10 13 3 3 -4
    2 5 6 5 2 3
    ;;;;
    run;

    proc transpose data=have out=temp;
      by id;
      var x1-x5;
    run;

    proc rank data=temp out=temprank;
      var col1;
      by id;
    run;

    proc transpose data=temprank out=want(drop=_name_ _label_);
      by id;
      var col1;
      id _name_;
    run;
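By default PROC RANK averages the ranks of tied values. If you want the ranking requested in the question, where ties share a rank and no rank numbers are skipped, a variation along these lines may work. This is a sketch, assuming your SAS release supports the TIES=DENSE option of PROC RANK; the data set names long, longrank and want_dense and the rank variable r are made up for illustration, and the input rows are the two from the question.

```sas
data have;
  input id x1-x5;
  datalines;
1 10 13 3 3 -4
2 8 3 -6 5 2
;
run;

/* One observation per (id, column); the values end up in COL1 */
proc transpose data=have out=long;
  by id;
  var x1-x5;
run;

/* TIES=DENSE: tied values share a rank and no rank numbers are skipped */
proc rank data=long out=longrank ties=dense;
  by id;
  var col1;
  ranks r;     /* write the ranks to a new variable instead of overwriting COL1 */
run;

/* Back to one row per id, with the ranks in r1-r5 */
proc transpose data=longrank out=want_dense(drop=_name_) prefix=r;
  by id;
  var r;
run;
```

For id 1 this should give r1-r5 = 3, 4, 2, 2, 1, matching the result asked for in the question. The input data set must be sorted by id for the BY statements to work.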