Data.table update join using constant

Question

Data.table update join using constant

I would like to update a subset of rows in a data.table based on a join and a fixed value.

d1 <- data.table(A = 5:1, B = letters[5:1])    
d2 <- data.table(C = letters[5:1], Z = 6:10)
current.val <- 5

what I want to do is update d1 based on the connection to d2, but only where A == 5 in d1. something like that:

d1[d2, D := i.Z ,on=.(B==C, A==current.val)]

my current approach is to add a new column in d2 and set it to a fixed value and use it in the join:

d2[, current.val := 5]
d1[d2, D := i.Z ,on=.(B==C, A==current.val)]

This works, but there seems to be a lot of overhead. Is there an easier way to use a constant value in a join?

(8/14) New example of a scale for comparison:

d1 <- data.table(A = 100:1, B = 100000000:1, D = as.numeric(NA),  key = c("A", "B"))
d2 <- data.table(C = 100000000:1, Z = c(10:1) / 10, key = "C")
current.val <- 5

system.time(d1[cbind(d2, A = current.val), on = .(B = C, A), D := i.Z])
system.time({setkey(d1, B, A); d1[d1[d2][A == current.val], D := Z]; setkey(d1, A, B)})
system.time(d1[d1[d2][A == current.val], D := Z]) # fastest, if inverse key order is acceptable

+2

performance join r data.table

Ethan Aug 13 '17 at 17:54

source share

2 answers

Frank · Answer 1 · 2017-08-13T18:43:06+0000

- d2 [...]. ?

. cbind:

d1[cbind(d2, A = current.val), on=.(B = C, A), D := i.Z ]

, c cbind , .

gcbenison · Answer 2 · 2017-08-14T14:20:28+0000

, , , d1 d2:

d1 <- data.table(A=5:1, B=letters[5:1], key="B")
d2 <- data.table(C=letters[5:1], Z=6:10, key="C")

:

d1[d1[d2][A==current.val], D := Z]

Data.table update join using constant

More articles: