Use unique rows from one data.frame to subset another data.frame

Question

I have a data.frame v whose unique rows I would like to use

# v
  DAY MONTH YEAR
1   1     1 2000
2   1     1 2000
3   2     2 2000
4   2     2 2000
5   2     3 2001

for a subset of data.frame w.

# w
  DAY MONTH YEAR V1 V2 V3
1   1     1 2000  1  2  3
2   1     1 2000  3  2  1
3   2     2 2000  2  3  1
4   2     2 2001  1  2  3
5   3     4 2001  3  2  1

The result is data.frame vw, which contains only the rows of w that correspond to the unique rows (i.e. unique (DAY, MONTH, YEAR) combinations) in v.

# vw
  DAY MONTH YEAR V1 V2 V3
1   1     1 2000  1  2  3
2   2     2 2000  2  3  1

Right now I am using the code below, where I merge the data.frames and then use ddply to select only the unique / first instance of each row. This works, but it will become cumbersome if I have to write out V1=x$V1[1], etc. for all of my variables in the ddply call. Is there a way to take the first instance of (DAY, MONTH, YEAR) together with the rest of the columns in that row?

Also, is there a way to avoid combining the data.frames before creating the final data.frame?

v <- structure(list(DAY = c(1L, 1L, 2L, 2L, 2L), MONTH = c(1L, 1L, 
2L, 2L, 3L), YEAR = c(2000L, 2000L, 2000L, 2000L, 2001L)), .Names = c("DAY", 
"MONTH", "YEAR"), class = "data.frame", row.names = c(NA, -5L
))

w <- structure(list(DAY = c(1L, 1L, 2L, 2L, 3L), MONTH = c(1L, 1L, 
2L, 2L, 4L), YEAR = c(2000L, 2000L, 2000L, 2001L, 2001L), V1 = c(1L, 
3L, 2L, 1L, 3L), V2 = c(2L, 2L, 3L, 2L, 2L), V3 = c(3L, 1L, 1L, 
3L, 1L)), .Names = c("DAY", "MONTH", "YEAR", "V1", "V2", "V3"
), class = "data.frame", row.names = c(NA, -5L))

vw_example <- structure(list(DAY = 1:2, MONTH = 1:2, YEAR = c(2000L, 2000L), 
    V1 = 1:2, V2 = 2:3, V3 = c(3L, 1L)), .Names = c("DAY", "MONTH", 
"YEAR", "V1", "V2", "V3"), class = "data.frame", row.names = c(NA, 
-2L))

library(plyr)  # provides ddply and .()

wv_inter <- merge(v, w, by=c("DAY","MONTH","YEAR"))

vw <- ddply(wv_inter,.(DAY, MONTH, YEAR),function(x) data.frame(DAY=x$DAY[1],MONTH=x$MONTH[1],YEAR=x$YEAR[1], V1=x$V1[1], V2=x$V2[1], V3=x$V3[1]))

+4

r unique dataframe plyr subset

nofunsally, Dec 18 '13 at 21:38

3

 library(data.table)
 v <- data.table(v)
 w <- data.table(w)

 setkey(v)
 setkeyv(w, names(v))

 # if you want to capture ALL unique values of `v`, use: 
 w[unique(v, by=NULL)]

 # if you want only values that mutually exist in `v` and `w` use: 
 w[unique(v, by=NULL), nomatch=0L]

+3

Ricardo Saporta, Dec 18 '13 at 21:42

Another approach:

Merge v and w, then keep in vw only the rows of the merged result that are not duplicated on DAY, MONTH and YEAR.

vw <- merge(v, w, by=c("DAY","MONTH","YEAR"))
vw <- vw[which( ! duplicated(vw[,c("DAY","MONTH","YEAR")]) ), ]
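The same idea can also be applied to w directly, without the intermediate merged data.frame. The following is only a sketch on the sample data from the question; `keys` and the helper `key_of` are names made up for this illustration, and pasting the key columns into one string assumes that "_" never occurs inside a key value.

```r
# Sample data from the question
v <- data.frame(DAY   = c(1L, 1L, 2L, 2L, 2L),
                MONTH = c(1L, 1L, 2L, 2L, 3L),
                YEAR  = c(2000L, 2000L, 2000L, 2000L, 2001L))
w <- data.frame(DAY   = c(1L, 1L, 2L, 2L, 3L),
                MONTH = c(1L, 1L, 2L, 2L, 4L),
                YEAR  = c(2000L, 2000L, 2000L, 2001L, 2001L),
                V1    = c(1L, 3L, 2L, 1L, 3L),
                V2    = c(2L, 2L, 3L, 2L, 2L),
                V3    = c(3L, 1L, 1L, 3L, 1L))

keys <- c("DAY", "MONTH", "YEAR")

# Hypothetical helper: one string per row built from the key columns
key_of <- function(df) do.call(paste, c(df[keys], sep = "_"))

in_v  <- key_of(w) %in% key_of(v)   # TRUE where the row's key also occurs in v
first <- !duplicated(w[keys])       # TRUE for the first occurrence of each key in w

vw <- w[in_v & first, ]
rownames(vw) <- NULL                # renumber rows 1..n
```

On the sample data this keeps rows 1 and 3 of w, which is the vw shown in the question.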

+1

keithing, Dec 18 '13 at 21:48

Blue Magister · Accepted Answer · 2013-12-18T21:44:48+0000

In R, take unique of v. You can then call merge, leaving by at its default.

vw <- merge(unique(v), w)

If you only want the first row for each (DAY, MONTH, YEAR) combination, then (untested):

vw <- ddply(wv_inter,.(DAY, MONTH, YEAR),function(x) x[1,])
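For readers without plyr, here is a rough base R sketch of the same two steps on the question's sample data. It is an illustration rather than part of the original answer: `merged` and `keys` are names introduced here. Note that merge(unique(v), w) on its own keeps every matching row of w, so on this data it returns three rows, not the two shown in the question.

```r
# Sample data from the question
v <- data.frame(DAY   = c(1L, 1L, 2L, 2L, 2L),
                MONTH = c(1L, 1L, 2L, 2L, 3L),
                YEAR  = c(2000L, 2000L, 2000L, 2000L, 2001L))
w <- data.frame(DAY   = c(1L, 1L, 2L, 2L, 3L),
                MONTH = c(1L, 1L, 2L, 2L, 4L),
                YEAR  = c(2000L, 2000L, 2000L, 2001L, 2001L),
                V1    = c(1L, 3L, 2L, 1L, 3L),
                V2    = c(2L, 2L, 3L, 2L, 2L),
                V3    = c(3L, 1L, 1L, 3L, 1L))

keys <- c("DAY", "MONTH", "YEAR")

# Step 1: inner join on the shared columns (the default for by);
# both w rows for (1, 1, 2000) survive, plus the one for (2, 2, 2000)
merged <- merge(unique(v), w)

# Step 2: keep one row per (DAY, MONTH, YEAR) combination
vw <- merged[!duplicated(merged[keys]), ]
rownames(vw) <- NULL
```

Which row of a duplicated key survives step 2 depends on the row order merge produces, so if "first" has to mean first in w, it is safer to remove the duplicates from w before merging.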
