Say I have a data.table such as:
sample<-data.table(id=c(1,1,2,2,3,3,3,4,4), name=c("apple","apple","orange","orange", "pear","pear","pear","banana","banana"), atr=c("pretty","ugly","bruised","delicious", "pear-shaped","bruised","infested", "too-ripe","perfect"), N=c(10,9,15,4,5,7,7,4,12))
I want to return essentially unique(sample[,list(id, name)]), except that I also need the atr column for the row with the largest N. Where there is a tie for the highest N, I don't care which of the tied rows is chosen, but I want only one to be selected.
This almost works: merge(sample[,list(N=max(N)),by=list(id,name)], sample, by=c("id","name","N")), but since pear has two atr values that tie for the max, it returns two pear rows. Besides not giving the expected result, I also assume / hope that there is a way to do this that does not involve a join.
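To make the goal concrete, here is a minimal sketch of the kind of join-free, grouped selection I have in mind. It assumes that taking the first row at the maximum is an acceptable way to break ties; which.max() returns the index of the first maximum, so each group yields exactly one row. The name best is just a placeholder for illustration.

```r
library(data.table)

sample <- data.table(
  id   = c(1, 1, 2, 2, 3, 3, 3, 4, 4),
  name = c("apple", "apple", "orange", "orange",
           "pear", "pear", "pear", "banana", "banana"),
  atr  = c("pretty", "ugly", "bruised", "delicious",
           "pear-shaped", "bruised", "infested", "too-ripe", "perfect"),
  N    = c(10, 9, 15, 4, 5, 7, 7, 4, 12)
)

# For each (id, name) group, keep only the row whose N is largest.
# which.max() returns a single index (the first maximum), so a tie
# such as the two pear rows with N = 7 still gives one row per group.
best <- sample[, .SD[which.max(N)], by = list(id, name)]
print(best)
```

With the data above this should give one row per fruit, with pear resolved to whichever tied row comes first in the table.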