How do you do a conditional "left join" in R?

Question


I have found myself doing a "conditional left join" several times in R. To illustrate with an example, suppose you have two data frames:

    > df
      a b
    1 1 0
    2 2 0
    > other.df
      a b
    1 2 3

The goal is to end up with this data frame:

    > final.df
      a b
    1 1 0
    2 2 3
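For anyone who wants to reproduce this, the two inputs and the target could be built as below. This is only a sketch: the numeric column types are an assumption, since the printed output does not show them.

```r
# Example inputs as printed above (numeric columns are an assumption).
df <- data.frame(a = c(1, 2), b = c(0, 0))
other.df <- data.frame(a = 2, b = 3)

# Desired result: b comes from other.df where a matches, otherwise stays 0.
final.df <- data.frame(a = c(1, 2), b = c(0, 3))
print(final.df)
```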

The code I have written so far:

    c <- merge(df, other.df, by = c("a"), all.x = TRUE)
    c[is.na(c$b.y), ]$b.y <- 0
    d <- subset(c, select = c("a", "b.y"))
    colnames(d)[2] <- "b"

This finally produces the result I wanted.

Doing this in four steps makes the code very opaque. Is there a better, less cumbersome way to do this?


r conditional left-join

svenski Jul 6 '12 at 21:25


2 answers

In two lines:

    c <- merge(df, other.df, all = TRUE)
    c <- c[which(!duplicated(c$a)), ]

This takes the values from both datasets and omits rows with a duplicate id from the second. I'm not sure which is left and which is right, so if you want the other one, flip the data upside down and do the same:

    c <- c[length(c$a):1, ]
    c <- c[which(!duplicated(c$a)), ]
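To see what each variant keeps, here is a small self-contained run on the question's data (rebuilt here under the assumption that the columns are numeric). Because `merge` without `by` joins on all common columns and sorts on them by default, which duplicate survives appears to depend on that sort order rather than on which data frame a row came from, so it may be worth checking on real data.

```r
# Question's inputs, rebuilt here (numeric columns are an assumption).
df <- data.frame(a = c(1, 2), b = c(0, 0))
other.df <- data.frame(a = 2, b = 3)

# Full outer merge on the shared columns a and b, sorted by them.
m <- merge(df, other.df, all = TRUE)

# Variant 1: keep the first row for each a.
first.kept <- m[!duplicated(m$a), ]

# Variant 2: reverse the rows, then keep the first row for each a,
# which is the last row for each a in the original order.
rev.m <- m[nrow(m):1, ]
last.kept <- rev.m[!duplicated(rev.m$a), ]

print(first.kept)
print(last.kept)
```

On this data the second variant is the one that yields b = 3 for a = 2, with the rows in reverse order.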


Seth Jul 6 '12 at 21:48


G. Grothendieck · Accepted Answer · 2012-07-10T22:53:30+0000

Here are two ways. In both cases, the first line does a left merge, returning the required columns. In the case of merge we also need to set the names. The final line in both cases replaces NA with 0.

merge

    res1 <- merge(df, other.df, by = "a", all.x = TRUE)[-2]
    names(res1) <- names(df)
    res1[is.na(res1)] <- 0
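As a quick check, the merge version can be run on the question's data with the intermediate step spelled out. This is a sketch: the inputs are rebuilt here assuming numeric columns, and `joined` is a name introduced only for illustration. The `b.x` and `b.y` names come from merge's default suffixes.

```r
# Question's inputs, rebuilt here (numeric columns are an assumption).
df <- data.frame(a = c(1, 2), b = c(0, 0))
other.df <- data.frame(a = 2, b = 3)

# Left merge on a; the clashing b columns become b.x (from df)
# and b.y (from other.df).
joined <- merge(df, other.df, by = "a", all.x = TRUE)

# Drop b.x (column 2), restore the original names,
# and fill the unmatched rows with 0.
res1 <- joined[-2]
names(res1) <- names(df)
res1[is.na(res1)] <- 0
print(res1)
```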

sqldf

    library(sqldf)
    res2 <- sqldf("select a, o.b from df left join 'other.df' o using(a)")
    res2[is.na(res2)] <- 0
