Flagging the first row of each group in an R data frame

I have a data frame that looks like this:

    id score
     1    15
     1    18
     1    16
     2    10
     2     9
     3     8
     3    47
     3    21

I would like a way to flag the first occurrence of each identifier, similar to FIRST. and LAST. in SAS. I tried the duplicated function, but I need to actually add the "flag" column to my data frame, since I run it through a loop later. I would like to get something like this:

    id score first_ind
     1    15         1
     1    18         0
     1    16         0
     2    10         1
     2     9         0
     3     8         1
     3    47         0
     3    21         0
4 answers
    > df$first_ind <- as.numeric(!duplicated(df$id))
    > df
      id score first_ind
    1  1    15         1
    2  1    18         0
    3  1    16         0
    4  2    10         1
    5  2     9         0
    6  3     8         1
    7  3    47         0
    8  3    21         0
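The question also mentions LAST. from SAS. As a minimal sketch (not part of the original answer), the same idea can probably be extended with the `fromLast` argument of `duplicated`. The column name `last_ind` is an assumption chosen for illustration, and `df` is rebuilt here from the question's data so the snippet runs on its own.

```r
# Rebuild the example data from the question
df <- data.frame(id    = c(1, 1, 1, 2, 2, 3, 3, 3),
                 score = c(15, 18, 16, 10, 9, 8, 47, 21))

# First occurrence of each id: not a duplicate when scanning top to bottom
df$first_ind <- as.numeric(!duplicated(df$id))

# Last occurrence of each id (hypothetical column name):
# not a duplicate when scanning bottom to top
df$last_ind <- as.numeric(!duplicated(df$id, fromLast = TRUE))

df
```

Neither flag depends on the rows being sorted by id, since `duplicated` only checks whether a value has been seen before in the scan direction.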

You can find the group boundaries using diff, provided the rows are sorted by id.

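    x <- read.table(text = "id score
    1 15
    1 18
    1 16
    2 10
    2 9
    3 8
    3 47
    3 21", header = TRUE)
    # A row starts a new group when its id differs from the previous row's id.
    # Comparing against 0 keeps the flag at 1 even if ids jump by more than one.
    x$first_id <- as.numeric(c(TRUE, diff(x$id) != 0))
    x
    #   id score first_id
    # 1  1    15        1
    # 2  1    18        0
    # 3  1    16        0
    # 4  2    10        1
    # 5  2     9        0
    # 6  3     8        1
    # 7  3    47        0
    # 8  3    21        0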
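A short sketch of why the raw differences may need care: when consecutive ids differ by more than one, `diff` returns the size of the jump rather than 1. The data frame `y` and the variable `raw` below are hypothetical, made up only to illustrate this; the data are still assumed to be sorted by id.

```r
# Hypothetical data: ids are sorted but not consecutive (1, 4, 9)
y <- data.frame(id    = c(1, 1, 4, 4, 4, 9),
                score = c(15, 18, 10, 9, 8, 47))

# Raw differences: the jump sizes (3 and 5) leak into the flag
raw <- c(1, diff(y$id))

# Testing for a non-zero difference gives a clean 1/0 flag
y$first_id <- as.numeric(c(TRUE, diff(y$id) != 0))

y
```

If the rows are not sorted by id, sorting first (for example with `y[order(y$id), ]`) would be needed before this approach makes sense.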

Using plyr:

    library("plyr")
    # Within each id group, flag the row at position 1
    ddply(x, "id", transform, first = as.numeric(seq_along(score) == 1))

or if you prefer dplyr:

    library("dplyr")
    # n() is the size of the current group: one 1 followed by n() - 1 zeros
    x %>% group_by(id) %>% mutate(first = c(1, rep(0, n() - 1)))

(although if you are working entirely within plyr/dplyr, you probably won't need this flag variable ...)


Another base R option:

    df$first_ind <- ave(df$id, df$id, FUN = seq_along) == 1
    df
    #  id score first_ind
    #1  1    15      TRUE
    #2  1    18     FALSE
    #3  1    16     FALSE
    #4  2    10      TRUE
    #5  2     9     FALSE
    #6  3     8      TRUE
    #7  3    47     FALSE
    #8  3    21     FALSE

This also works when the ids are unsorted. If you want 1/0 instead of TRUE/FALSE, you can wrap the whole expression in as.integer(.).
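As a small sketch of both points, here is the same expression wrapped in `as.integer` and applied to shuffled ids. The data frame `z` is hypothetical and exists only for this illustration.

```r
# Hypothetical data: the same kind of ids, but not sorted
z <- data.frame(id    = c(2, 1, 2, 3, 1, 3),
                score = c(10, 15, 9, 8, 18, 47))

# ave() numbers the rows within each id group in order of appearance;
# the row numbered 1 is the first occurrence, and as.integer() gives 1/0
z$first_ind <- as.integer(ave(z$id, z$id, FUN = seq_along) == 1)

z
```

Each id is flagged on the row where it first appears, regardless of where that row sits in the data frame.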
