Delete duplicate rows in two columns at the same time.

Question

I would like to remove duplicate rows based on two columns, not just one.

My input df:

RAW.PVAL  GR     allrl  Bak
0.05      fr     EN1    B12
0.05      fg     EN1    B11
0.45      fr     EN2    B10
0.35      fg     EN2    B066

My desired output:

RAW.PVAL  GR     allrl  Bak
0.05      fr     EN1    B12
0.45      fr     EN2    B10
0.35      fg     EN2    B066

I tried df <- subset(df, !duplicated(allrl, RAW.PVAL)), but it does not remove rows that are duplicated in both columns at the same time.

Thanks!

+4

r duplicates subset

user3091668 Aug 14 '14 at 6:32

2 answers

Use unique() to remove duplicate values.
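As a rough sketch of what unique() does on this data (the df below is rebuilt by hand from the table in the question, so treat it as an assumption about the real data): called on the whole data frame it compares all four columns, and called on just the two key columns it drops the others.

```r
# Rebuild the example data from the question (assumed to match the real df)
df <- data.frame(
  RAW.PVAL = c(0.05, 0.05, 0.45, 0.35),
  GR       = c("fr", "fg", "fr", "fg"),
  allrl    = c("EN1", "EN1", "EN2", "EN2"),
  Bak      = c("B12", "B11", "B10", "B066"),
  stringsAsFactors = FALSE
)

# unique() on the full data frame compares every column,
# so no row is dropped here: all four rows differ somewhere
all_cols <- unique(df)

# unique() on the two key columns removes the repeated pair,
# but the result no longer contains GR and Bak
key_cols <- unique(df[c("RAW.PVAL", "allrl")])

print(all_cols)
print(key_cols)
```

So unique() alone is probably not enough here if the other columns need to be kept.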

-2

Bharathi Aug 14 '14 at 6:37

akrun · Accepted Answer · 2014-08-14T06:36:35+0000

If you want to use subset, you can try:

  subset(df, !duplicated(subset(df, select = c(allrl, RAW.PVAL))))
  #   RAW.PVAL GR allrl  Bak
  # 1     0.05 fr   EN1  B12
  # 3     0.45 fr   EN2  B10
  # 4     0.35 fg   EN2 B066

But I think @thelatemail's approach would be better:

  df[!duplicated(df[c("RAW.PVAL","allrl")]),]
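For anyone who wants to run this end to end, here is a minimal reproducible sketch. The df is typed in by hand from the question's table, and the names dedup and naive are just illustrative. It also hints at why the original attempt misbehaves: the second positional argument of duplicated() is incomparables, not a second column, so as far as I can tell that call effectively deduplicates on allrl alone.

```r
# Example data rebuilt from the question (an assumption about the real df)
df <- data.frame(
  RAW.PVAL = c(0.05, 0.05, 0.45, 0.35),
  GR       = c("fr", "fg", "fr", "fg"),
  allrl    = c("EN1", "EN1", "EN2", "EN2"),
  Bak      = c("B12", "B11", "B10", "B066"),
  stringsAsFactors = FALSE
)

# Pass a two-column data frame to duplicated(): a row is flagged only
# when the (RAW.PVAL, allrl) pair has already appeared in an earlier row
dedup <- df[!duplicated(df[c("RAW.PVAL", "allrl")]), ]
print(dedup)

# The attempt from the question: RAW.PVAL lands in the `incomparables`
# argument, so only allrl is actually checked for duplicates
naive <- subset(df, !duplicated(allrl, RAW.PVAL))
print(naive)
```

The first occurrence of each pair is kept; duplicated() also accepts fromLast = TRUE if the last occurrence should be kept instead.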
