I am trying to remove duplicate rows from a data frame, keeping only the row with the maximum value in another column.
So for the data frame:
df <- data.frame(id = c("a", "a", "a", "b", "b", "r"),
                 val1 = c(2, 3, 3, 1, 2, 4),
                 val2 = c(3, 4, 5, 3, 6, 5))
id val1 val2
a 2 3
a 3 4
a 3 5
b 1 3
b 2 6
r 4 5
I would like to remove all duplicates by id, dropping every row that does not have the maximum value of val2 among the rows sharing its id.
Thus, the data frame should become:
id val1 val2
a 3 5
b 2 6
r 4 5
For example, within subset(df, df$id == "a"), delete all duplicates but keep the row with the maximum value of df$val2.
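One possible way to get this result, as a minimal base R sketch rather than a definitive solution: sort by id with val2 descending, then keep the first row of each id. It assumes val2 is numeric (the data frame is rebuilt here with numeric columns so the block runs on its own), and the names sorted and result are only illustrative. If two rows of the same id tie on the maximum val2, only one of them is kept.

```r
# Example data, rebuilt with numeric val1/val2 so comparisons are numeric.
df <- data.frame(
  id   = c("a", "a", "a", "b", "b", "r"),
  val1 = c(2, 3, 3, 1, 2, 4),
  val2 = c(3, 4, 5, 3, 6, 5)
)

# Sort by id, then by val2 descending, so each id's maximum comes first.
sorted <- df[order(df$id, -df$val2), ]

# duplicated() flags every repeat after the first occurrence of an id,
# so negating it keeps only the first (maximum-val2) row per id.
result <- sorted[!duplicated(sorted$id), ]
rownames(result) <- NULL

print(result)
```

An alternative along the same lines would be to compute the per-id maximum with ave(df$val2, df$id, FUN = max) and keep rows where val2 equals it, which retains all tied rows instead of just one.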