Multiple joins / merges with data.tables

I have two data.tables, DT and L:

    > DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9, key="x")
    > L = data.table(yv=c(1L:8L,12L), lu=c(letters[8:1],letters[12]), key="yv")
    > DT
       x y v
    1: a 1 1
    2: a 3 2
    3: a 6 3
    4: b 1 4
    5: b 3 5
    6: b 6 6
    7: c 1 7
    8: c 3 8
    9: c 6 9
    > L
       yv lu
    1:  1  h
    2:  2  g
    3:  3  f
    4:  4  e
    5:  5  d
    6:  6  c
    7:  7  b
    8:  8  a
    9: 12  l

I would like to independently look up the corresponding lu value from L for column y and for column v in DT. The following syntax gives the correct result, but it is cumbersome to write and hard to understand at a glance:

    > L[setkey(L[setkey(DT,y)],v)][,list(x,y=yv.1,v=yv,lu.1=lu.1,lu.2=lu)]
       x y v lu.1 lu.2
    1: a 1 1    h    h
    2: a 2 3    g    f
    3: a 3 6    f    c
    4: b 4 1    e    h
    5: b 5 3    d    f
    6: b 6 6    c    c
    7: c 7 1    b    h
    8: c 8 3    a    f
    9: c 9 6   NA    c

(Edit: the original post had L[setkey(L[setkey(DT,y)],v)][,list(x,y=yv,v=yv.1,lu.1=lu,lu.2=lu.1)] above, which mixed up the y and v columns and their looked-up values.)

In SQL, this would be straightforward:

    SELECT DT.*, L1.lu AS lu1, L2.lu AS lu2
    FROM DT
    LEFT JOIN L AS L1 ON DT.y = L1.yv
    LEFT JOIN L AS L2 ON DT.v = L2.yv

Is there a more elegant way to perform multiple joins with data.table? Note that in this example I join the same table twice to another table, but I am also interested in joining one table to several different tables.

1 answer

Great question. One trick is that i does not need to be keyed. Only x needs to be keyed.
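To make the keying point concrete, here is a minimal sketch. The table name `probe` and its values are hypothetical and not part of the question; only `L` (the x in `x[i]`) carries a key.

```r
library(data.table)

# L is the keyed lookup table from the question (the x in x[i]).
L <- data.table(yv = c(1L:8L, 12L), lu = c(letters[8:1], letters[12]), key = "yv")

# probe is a hypothetical, deliberately unkeyed table (the i in x[i]).
probe <- data.table(yv = c(6L, 1L, 12L))

# The join matches probe's first column against L's key.
# Rows come back in probe's order, not in L's key order.
res <- L[probe]
```

Here `haskey(probe)` is FALSE, yet the join still works because the lookup only relies on the key of `L`.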

There may be better ways. How about this:

    > cbind( L[DT[,list(y)]], L[DT[,list(v)]], DT )
       yv lu yv lu x y v
    1:  1  h  1  h a 1 1
    2:  3  f  2  g a 3 2
    3:  6  c  3  f a 6 3
    4:  1  h  4  e b 1 4
    5:  3  f  5  d b 3 5
    6:  6  c  6  c b 6 6
    7:  1  h  7  b c 1 7
    8:  3  f  8  a c 3 8
    9:  6  c  9 NA c 6 9

Or, to illustrate, this is equivalent:

    > cbind( L[J(DT$y)], L[J(DT$v)], DT )
       yv lu yv lu x y v
    1:  1  h  1  h a 1 1
    2:  3  f  2  g a 3 2
    3:  6  c  3  f a 6 3
    4:  1  h  4  e b 1 4
    5:  3  f  5  d b 3 5
    6:  6  c  6  c b 6 6
    7:  1  h  7  b c 1 7
    8:  3  f  8  a c 3 8
    9:  6  c  9 NA c 6 9
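If only the looked-up values are wanted, the same two lookups can be attached to a copy of DT as named columns instead of going through cbind. This is a rough sketch using the tables from the question; the names `res`, `lu.1` and `lu.2` are chosen here to mirror the SQL aliases and are not part of the answer above.

```r
library(data.table)

# Tables as defined in the question.
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9, key = "x")
L  <- data.table(yv = c(1L:8L, 12L), lu = c(letters[8:1], letters[12]), key = "yv")

# Work on a copy so the original DT is left untouched.
res <- copy(DT)

# Each lookup joins an unkeyed i (built with J) against the keyed L and
# returns one row per row of DT, in DT's order; unmatched values give NA.
res[, lu.1 := L[J(DT$y)]$lu]
res[, lu.2 := L[J(DT$v)]$lu]
```

The last row has v = 9, which does not occur in L, so its lu.2 is NA, just as with the SQL LEFT JOIN.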

merge could also be used once the following feature request is implemented:

FR #2033: Add by.x and by.y to merge.data.table
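As far as I know, later data.table releases do accept by.x and by.y in merge. Assuming such a version is installed, the two left joins from the SQL might be sketched as below; the intermediate names `m1` and `m2` are illustrative only.

```r
library(data.table)

# Tables as defined in the question.
DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9, key = "x")
L  <- data.table(yv = c(1L:8L, 12L), lu = c(letters[8:1], letters[12]), key = "yv")

# First left join: DT.y = L.yv (all.x = TRUE keeps unmatched rows of DT).
m1 <- merge(DT, L, by.x = "y", by.y = "yv", all.x = TRUE)
setnames(m1, "lu", "lu.1")

# Second left join against the same lookup table: DT.v = L.yv.
m2 <- merge(m1, L, by.x = "v", by.y = "yv", all.x = TRUE)
setnames(m2, "lu", "lu.2")

# Restore the column order used in the question's output.
setcolorder(m2, c("x", "y", "v", "lu.1", "lu.2"))
```

Each merge is one LEFT JOIN, so joining one table to several different lookup tables would follow the same pattern with a different second argument.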
