Delete the first duplicate row and keep the rest?

Question

I want to remove duplicates based on the User column, but only the first row of each User that appears more than once. Users that appear only once should be kept.

DF:

User  No
A     1
B     1
A     2
A     3
A     4
C     1
B     2
D     1

Result: (Removed A1 and B1)

User  No  
A     2
A     3
A     4
C     1
B     2
D     1

I was unable to do this with the duplicated function.

Any help would be appreciated! Thanks!

+4

r

ant Apr 20 '15 at 5:24


3 answers

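Here is an option using data.table. We convert the 'data.frame' to a 'data.table' (setDT(DF)). Grouped by the "User" column, we select all rows except the first one (tail(.SD, -1)), where .SD is the Subset of Data.table for the current group. However, this would also remove the row when a "User" group has only a single row. To avoid that, we use an if/else condition: if the group has more than 1 row (.N>1), we drop its first row, else we return the group unchanged (.SD).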

library(data.table)
setDT(DF)[, if(.N>1) tail(.SD,-1) else .SD , by = User]
#   User No
#1:    A  2
#2:    A  3
#3:    A  4
#4:    B  2
#5:    C  1
#6:    D  1
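
To see why the if/else guard matters, here is a minimal base R sketch of what tail(x, -1) does to a one-row group versus a multi-row group. The data frames one_row and many_rows are hypothetical stand-ins for what .SD would hold within a group; they are not part of the original answer.

```r
# Hypothetical stand-ins for the per-group subsets (.SD) in the data.table call
one_row   <- data.frame(User = "C", No = 1)
many_rows <- data.frame(User = c("A", "A", "A"), No = c(1, 2, 3))

# tail(x, -1) returns everything except the first row
rows_left_single <- nrow(tail(one_row, -1))    # the only row is dropped
rows_left_multi  <- nrow(tail(many_rows, -1))  # the first of three rows is dropped

rows_left_single
rows_left_multi
```

Without the .N>1 check, users such as C and D would disappear from the result entirely.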

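Or, similar to @MrFlick's dplyr code, we can use duplicated together with .N (the number of rows in the group). We create a column "N", grouped by "User", that checks whether the group has exactly one row (.N==1), and then subset the dataset to the rows where either N is TRUE or duplicated "User" is TRUE. duplicated returns TRUE for a repeated value and FALSE for its first occurrence. The helper column is removed again at the end (N:=NULL).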

setDT(DF)[DF[, N:=.N==1, by = User][, N|duplicated(User)]][,N:=NULL][]

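Or in base R, we use ave to create a logical index ('indx2') that is TRUE where the length of the "User" group is 1. Combining it with duplicated keeps the single-row users as well as every row after the first one for the other users.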

indx2 <- with(DF, ave(seq_along(User), User, FUN=length)==1)
DF[duplicated(DF$User)|indx2,]
#   User No
#3    A  2
#4    A  3
#5    A  4
#6    C  1
#7    B  2
#8    D  1
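
The intermediate vectors behind the ave approach can be inspected with a short sketch like the following. It assumes the same DF as in the question; the names group_size, is_single, is_dup and keep are introduced here only for illustration.

```r
# Example data, as in the question
DF <- data.frame(User = c("A", "B", "A", "A", "A", "C", "B", "D"),
                 No   = c(1, 1, 2, 3, 4, 1, 2, 1))

# Number of rows per User, repeated for every row of that User
group_size <- ave(seq_along(DF$User), DF$User, FUN = length)

is_single <- group_size == 1      # same as indx2 above
is_dup    <- duplicated(DF$User)  # TRUE for every row after a User's first
keep      <- is_dup | is_single

# Show all helper vectors side by side
data.frame(DF, group_size, is_single, is_dup, keep)
```

Only rows 1 and 2 (A1 and B1) end up with keep equal to FALSE, which matches the requested result.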

+11

akrun Apr 20 '15 at 5:30

For completeness, here is another solution in base R

#data
DF=data.frame(User=c("A","B","A","A","A","C","B","D"),No=c(1,1,2,3,4,1,2,1))
#solution
subset(DF,duplicated(User)|!duplicated(User,fromLast=TRUE))

#  User No
#3    A  2
#4    A  3
#5    A  4
#6    C  1
#7    B  2
#8    D  1

Explanation:

subset(DF,logicalA|logicalB)

logicalA... selects every duplicated entry (all rows after a user's first row) and therefore omits all users with exactly one entry
logicalB... selects all users with exactly one row, and also the last row (see fromLast=TRUE) of users with more than one row (the latter rows are already selected by logicalA anyway)

Hope I got it right. :)
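
As a quick check of this explanation, the two conditions can be computed separately and shown next to the data. This is only an illustrative sketch: logicalA and logicalB are the placeholder names from the explanation above, not objects created by the original one-liner.

```r
# Example data, as defined in this answer
DF <- data.frame(User = c("A", "B", "A", "A", "A", "C", "B", "D"),
                 No   = c(1, 1, 2, 3, 4, 1, 2, 1))

logicalA <- duplicated(DF$User)                    # every row after a User's first
logicalB <- !duplicated(DF$User, fromLast = TRUE)  # the last row of each User

# Side-by-side view of both conditions and their union
data.frame(DF, logicalA, logicalB, keep = logicalA | logicalB)

result <- DF[logicalA | logicalB, ]
result
```

For users with a single row, that row is both the first and the last one, so only logicalB picks it up.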

+3

cryo111 Apr 20 '15 at 8:03


MrFlick · Accepted Answer · 2015-04-20T05:32:52+0000

If I understand correctly, this should work (here dd is the input data frame, called DF in the question)

library(dplyr)
dd %>% group_by(User) %>% filter(duplicated(User) | n()==1)
