I think the approach using ave is reasonable. But there are many ways to solve this problem. I will show several other ways to use the R base. Then in the last two examples I will introduce the data.table package.
Again, just throwing it away to provide some use cases for different aspects of the language.
d1 <- data.frame(ID=c(1,1,1,2,3,3,4,4,4), Year=c(2009,2010,2011, 2009,2009, 2010, 2009, 2010, 2011), V1=c(33, 67, 38, 45, 65, 74, 47, 51, 14)) # long way use_years <- as.character(2009:2011) cnts <- table(d1[,c("ID","Year")])[,use_years] use_id <- rownames(cnts)[rowSums(cnts)==length(use_years)] d1[d1[,"ID"]%in%use_id,] # 1 1 2009 33 # 2 1 2010 67 # 3 1 2011 38 # 7 4 2009 47 # 8 4 2010 51 # 9 4 2011 14 # another longish way ind1 <- d1[,"Year"]%in%2009:2011 d1_ind <- d1[ind1,"ID"] ind2 <- d1_ind %in% unique(d1_ind)[tabulate(d1_ind)==3] d1[ind1,][ind2,] # ID Year V1 # 1 1 2009 33 # 2 1 2010 67 # 3 1 2011 38 # 7 4 2009 47 # 8 4 2010 51 # 9 4 2011 14
OK, try a few methods using data.table. One of my favorite packages of all time. Maybe at first itβs a little tricky, so make sure your shoes are hard (oh, yes, that's fast!) :)
# medium way library(data.table) d2 <- as.data.table(d1) d2[ID%in%d2[Year%in%2009:2011, list(logic=nrow(.SD)==3),by="ID"][(logic),ID]] # ID Year V1 # 1: 1 2009 33 # 2: 1 2010 67 # 3: 1 2011 38 # 4: 4 2009 47 # 5: 4 2010 51 # 6: 4 2011 14 # short way d2[Year%in%2009:2011][ID%in%unique(ID)[table(ID)==3]] # ID Year V1 # 1: 1 2009 33 # 2: 1 2010 67 # 3: 1 2011 38 # 4: 4 2009 47 # 5: 4 2010 51 # 6: 4 2011 14