I am trying to create a new column that indicates whether an identifier was present in the previous group. Here is my data:
library(data.table)
data <- data.table(ID = c(1:3, 9, 2, 3, 4, 5, 1),
                   groups = rep(c("a", "b", "c"), c(3, 4, 2)))
   ID groups
1:  1      a
2:  2      a
3:  3      a
4:  9      b
5:  2      b
6:  3      b
7:  4      b
8:  5      c
9:  1      c
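I am not sure how to refer to the previous (lagged) group. I tried shift, but it does not work, because with by = groups it shifts the IDs within the current group rather than looking at the previous group:
data[, .(ID = ID, match_lagged = ID %in% shift(ID)), by = groups]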
Here is my desired result.
The first three rows are not matched because there is no previous group. FALSE would also work for these three rows. ID = 4 (in group b) has no match in group a. ID = 5 (in group c) has no match in group b.
Note that ID = 1 in group c has no match in group b, so it must be FALSE even though it exists in group a. That is why duplicated(data$ID) does not work. Each group's IDs must be matched against the immediately preceding group only.
   groups ID match_lagged
1:      a  1           NA
2:      a  2           NA
3:      a  3           NA
4:      b  9        FALSE
5:      b  2         TRUE
6:      b  3         TRUE
7:      b  4        FALSE
8:      c  5        FALSE
9:      c  1        FALSE
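To make the rule concrete, the matching described above can be sketched roughly as follows. This is only a minimal sketch, assuming that "previous group" means the group that appears immediately before in the data; the names grp_order, ids_by_grp and prev_ids are hypothetical helpers introduced for illustration, not data.table functions, and there may well be a more idiomatic way.

```r
library(data.table)

data <- data.table(ID = c(1:3, 9, 2, 3, 4, 5, 1),
                   groups = rep(c("a", "b", "c"), c(3, 4, 2)))

# Group labels in order of first appearance (assumed to define the lag order).
grp_order <- unique(data$groups)

# IDs of each group, kept in that same order.
ids_by_grp <- split(data$ID, factor(data$groups, levels = grp_order))

# Shift the list by one so that position i holds the IDs of group i - 1;
# the first group gets NULL because it has no predecessor.
prev_ids <- c(list(NULL), ids_by_grp[-length(ids_by_grp)])

# .GRP is the running group counter inside `by` (order of first appearance).
data[, match_lagged := ID %in% prev_ids[[.GRP]], by = groups]

# Mark the first group as NA rather than FALSE, since it has nothing to match.
data[groups == grp_order[1], match_lagged := NA]
```

The last step can be dropped if FALSE is acceptable for the first group.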
A dplyr solution would also work.