How can I rank observations within groups in Stata?

Question

I have some data in Stata that looks like the first two columns:

 group_id  var_to_rank  desired_rank
 ____________________________________
 1         10           1
 1         20           2
 1         30           3
 1         40           4
 2         10           1
 2         20           2
 2         20           2
 2         30           3

I would like to create the rank of each observation within a group (group_id) according to one variable (var_to_rank). Usually for this purpose I used:

 gen id = _n

However, some of my observations (group_id = 2 in my small example) have the same ranking variable values, and this approach does not work.

I also tried using:

 egen rank

with various options, but I cannot make my rank variable look like desired_rank.

Could you point me to a solution to this problem?

+4

stata

radek May 18, '11 at 10:42

6 answers

The following works for me:

 bysort group_id: egen desired_rank=rank(var_to_rank)

+7

chl May 18 '11 at 11:40

I came across this solution on Statalist:

 bysort group_id (var_to_rank) : gen rank = var_to_rank != var_to_rank[_n-1]
 by group_id : replace rank = sum(rank)

This seems to solve the problem.

+5

radek May 18, '11 at 13:41

@radek: you have probably figured this out by now, but this would be an easy (though not very elegant) solution:

 bysort group_id: egen desired_rank_HELP = rank(var_to_rank), field
 egen desired_rank = group(group_id desired_rank_HELP)
 drop desired_rank_HELP

+3

sam Jan 4 '13 at 20:42

Too much work. Easy and elegant. Try it.

 gen desired_rank = int(var_to_rank / 10)

0

Lazy Aug 29 '13 at 11:29

Try this command; it works well for me: egen newid=group(oldid)

0

bontey Jan 14 '14 at 7:57

Nick Cox · Accepted Answer · 2013-01-05T13:00:49+0000

I would say that the question is not posed in the way that gives the best understanding. The goal is to group observations: those with the lowest value are all assigned grade 1, those with the next lowest value are all assigned 2, and so on. This is not a ranking in most of the senses I have seen discussed, but Stata's egen, rank() gets you part of the way there.
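To see how far egen, rank() gets, here is a minimal sketch on the question's sample data. The variable names rank_default and rank_track are made up for illustration. By default rank() assigns tied values their average rank, and the track option assigns them the lowest rank of the tie, so neither produces the consecutive 1, 2, 2, 3 pattern wanted in group 2.

```stata
* Sample data from the question (first two columns only)
clear
input group_id var_to_rank
1 10
1 20
1 30
1 40
2 10
2 20
2 20
2 30
end

* Default: ties share the average rank (group 2 gets 1, 2.5, 2.5, 4)
bysort group_id: egen rank_default = rank(var_to_rank)

* track: ties share the lowest rank (group 2 gets 1, 2, 2, 4)
bysort group_id: egen rank_track = rank(var_to_rank), track

list, sepby(group_id)
```

In both cases the value after the tie skips ahead to 4, which is why a different approach is needed for the grouping the question asks for.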

But the direct approach from the Statalist thread mentioned above is simpler in spirit than any of the solutions cited:

 bysort group_id (var_to_rank): gen desired_rank = sum(var_to_rank != var_to_rank[_n-1])

Once the data are sorted on var_to_rank , the expression var_to_rank != var_to_rank[_n-1] evaluates to 1 at the start of each block of distinct values, where the value differs from the previous one, and to 0 otherwise. Summing these 1s and 0s cumulatively gives the desired variable. The prefix bysort does the sorting and ensures that all this is done separately within the groups defined by group_id . There is no need for egen at all (a command that many people who use Stata only occasionally find bizarre).
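The explanation above can be checked on the question's sample data. This is a sketch, not part of the original answer: the column named expected is a hypothetical helper holding the values from the question's desired_rank column, so that the result can be verified with assert.

```stata
* Sample data from the question; "expected" holds the desired ranks
clear
input group_id var_to_rank expected
1 10 1
1 20 2
1 30 3
1 40 4
2 10 1
2 20 2
2 20 2
2 30 3
end

* Within each group, sorted on var_to_rank, flag the start of every block
* of distinct values with 1 (the first observation compares against a
* missing value, so it is always flagged) and take the running sum
bysort group_id (var_to_rank): gen desired_rank = sum(var_to_rank != var_to_rank[_n-1])

* Stops with an error if any observation differs from the wanted value
assert desired_rank == expected

list, sepby(group_id)
```

The tied values of 20 in group 2 both receive 2, and the next value, 30, receives 3 rather than 4.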

Declaration of interest: the Statalist post cited shows that when I was asked a similar question, I too did not think of this solution in one go.
