Flagging the first row of each group in an R data frame

Question

I have a data frame that looks like this:

    id score
     1    15
     1    18
     1    16
     2    10
     2     9
     3     8
     3    47
     3    21

I would like a way to flag the first occurrence of each id, similar to first. and last. in SAS. I tried the duplicated function, but I need to actually add the flag as a column in my data frame, since I run it through a loop later. I would like to get something like this:

    id score first_ind
     1    15         1
     1    18         0
     1    16         0
     2    10         1
     2     9         0
     3     8         1
     3    47         0
     3    21         0

+7

r

davids12 Oct 08 '14 at 20:22

4 answers

You can find the edges between groups using diff, provided the rows are already sorted by id:

    x <- read.table(text = "id score
    1 15
    1 18
    1 16
    2 10
    2 9
    3 8
    3 47
    3 21", header = TRUE)

    x$first_id <- c(1, diff(x$id))
    x
      id score first_id
    1  1    15        1
    2  1    18        0
    3  1    16        0
    4  2    10        1
    5  2     9        0
    6  3     8        1
    7  3    47        0
    8  3    21        0
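One caveat: diff returns the size of each jump, so the result is a clean 1/0 flag only when consecutive ids differ by exactly 1. A minimal sketch of a variant that compares the difference against zero instead, assuming only that the rows are grouped by id (the data frame `y` and its values are made up for illustration and are not part of the original answer):

```r
# Hypothetical data: ids are grouped, but jump by more than 1
y <- data.frame(id = c(1, 1, 4, 4, 9), score = c(15, 18, 10, 9, 8))

# diff(y$id) != 0 is TRUE wherever the id changes from the previous row;
# the first row always starts a group, so prepend TRUE
y$first_id <- as.integer(c(TRUE, diff(y$id) != 0))
y$first_id
```

With plain `c(1, diff(y$id))` the same data would produce the values 3 and 5 at the group starts rather than 1.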

+6

Roman Luštrik Oct 08 '14 at 20:28

Using plyr:

    library("plyr")
    ddply(x, "id", transform, first = as.numeric(seq(length(score)) == 1))

or, if you prefer dplyr:

    library("dplyr")
    x %>% group_by(id) %>% mutate(first = c(1, rep(0, n() - 1)))

(although if you are working entirely within plyr/dplyr, you probably won't need this flag variable ...)

+3

Ben Bolker Oct 08 '14 at 20:32

Another base R option:

    df$first_ind <- ave(df$id, df$id, FUN = seq_along) == 1
    df
    #  id score first_ind
    #1  1    15      TRUE
    #2  1    18     FALSE
    #3  1    16     FALSE
    #4  2    10      TRUE
    #5  2     9     FALSE
    #6  3     8      TRUE
    #7  3    47     FALSE
    #8  3    21     FALSE

This also works when the ids are unsorted. If you want 1/0 instead of TRUE/FALSE, you can simply wrap it in as.integer(.).
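To illustrate the unsorted case, here is a small sketch with made-up data (the data frame `d` is hypothetical) in which the ids are interleaved, with the as.integer wrapper applied:

```r
# Hypothetical unsorted data: ids 1 and 2 are interleaved
d <- data.frame(id = c(2, 1, 2, 1, 3), score = c(10, 15, 9, 18, 8))

# seq_along numbers the rows within each id in order of appearance,
# so a value of 1 marks the first time that id is seen
d$first_ind <- as.integer(ave(d$id, d$id, FUN = seq_along) == 1)
d$first_ind
```

Rows 1, 2 and 5 are flagged, since those are the first appearances of ids 2, 1 and 3 respectively.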

+2

docendo discimus Oct 9 '14 at 11:14

Jilber Urbina · Accepted Answer · 2014-10-08T20:29:14+0000

    > df$first_ind <- as.numeric(!duplicated(df$id))
    > df
      id score first_ind
    1  1    15         1
    2  1    18         0
    3  1    16         0
    4  2    10         1
    5  2     9         0
    6  3     8         1
    7  3    47         0
    8  3    21         0
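Since the question also mentions SAS's last., the same idea can be sketched for the last row of each id using the fromLast argument of duplicated. This is an extension for illustration rather than part of the accepted answer, and the column name last_ind is an assumption:

```r
# Rebuild the example data from the question
df <- data.frame(id    = c(1, 1, 1, 2, 2, 3, 3, 3),
                 score = c(15, 18, 16, 10, 9, 8, 47, 21))

# First occurrence of each id, as in the answer above
df$first_ind <- as.numeric(!duplicated(df$id))

# Scanning from the end marks the last occurrence of each id instead
df$last_ind <- as.numeric(!duplicated(df$id, fromLast = TRUE))
df
```

Like the first-row flag, this does not require the ids to be sorted.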
