How to make a complex multi-column match in R

I want to match two data frames based on conditional expressions over more than one column, but I cannot figure out how to do this. Say these are my data sets:

df1 <- data.frame(lower=c(0,5,10,15,20), upper=c(4,9,14,19,24), x=c(12,45,67,89,10))
df2 <- data.frame(age=c(12, 14, 5, 2, 9, 19, 22, 18, 23))

I want to match each age in df2 to the row of df1 whose range from lower to upper contains it, and then add a column to df2 holding the value of x from that row. That is, I want df2 to look like:

age    x
12    67
14    67
 5    45
....etc. 

How can I achieve such a match?

3 answers

I would go with a simple sapply and an "anded" condition to select from df1$x, as follows:

df2$x <- sapply( df2$age, function(x) { df1$x[ x >= df1$lower & x <= df1$upper ] })

which gives:

> df2
  age  x
1  12 67
2  14 67
3   5 45
4   2 12
5   9 45
6  19 89
7  22 10
8  18 89
9  23 10

For example, for an age of 12, the condition evaluates to:

> 12 >= df1$lower & 12 <= df1$upper
[1] FALSE FALSE  TRUE FALSE FALSE

This logical vector is then used to index df1$x, which picks out the third element, 67.
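One thing to watch for with this approach: if an age falls outside every range, the subset has length zero and sapply can no longer simplify to a plain vector (it returns a list instead). A minimal sketch of a more defensive variant follows, assuming that NA is the desired result for unmatched ages. The helper name lookup_x and the out-of-range age 30 are made up for illustration and are not part of the original answer.

```r
df1 <- data.frame(lower = c(0, 5, 10, 15, 20),
                  upper = c(4, 9, 14, 19, 24),
                  x     = c(12, 45, 67, 89, 10))

# The last age (30) is a hypothetical value that lies outside every range
ages <- c(12, 14, 5, 2, 9, 30)

# Hypothetical helper: return x for the first range containing `a`, else NA
lookup_x <- function(a) {
  hit <- a >= df1$lower & a <= df1$upper  # one TRUE/FALSE per row of df1
  if (any(hit)) df1$x[hit][1] else NA_real_
}

# vapply enforces exactly one numeric value per age
res <- vapply(ages, lookup_x, numeric(1))
res
# [1] 67 67 45 12 45 NA
```

Taking the first hit is an arbitrary choice that only matters if the ranges in df1 overlap, which they do not in the example data.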


Another option is foverlaps from the data.table package, which performs an overlap join:

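library(data.table)
setDT(df1)                  # convert to data.table by reference
setDT(df2)[, age2 := age]   # treat each age as the zero-width interval [age, age2]
setkey(df1, lower, upper)   # foverlaps requires the second table to be keyed
# join each [age, age2] interval to the [lower, upper] range it overlaps
foverlaps(df2, df1, by.x = names(df2), by.y = c("lower", "upper"))[, list(age, x)]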

#    age  x
# 1:  12 67
# 2:  14 67
# 3:   5 45
# 4:   2 12
# 5:   9 45
# 6:  19 89
# 7:  22 10
# 8:  18 89
# 9:  23 10

Here is another, vectorized approach using findInterval on a melted (long-format) copy of df1:

library(data.table) 
df2$x <- melt(setDT(df1), "x")[order(value), x[findInterval(df2$age, value)]]
#   age  x
# 1  12 67
# 2  14 67
# 3   5 45
# 4   2 12
# 5   9 45
# 6  19 89
# 7  22 10
# 8  18 89
# 9  23 10

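The idea here is to:

  • First, reshape the data so that lower and upper end up in the same column, with x holding the corresponding values,
  • then sort the data by these boundary values (required by findInterval).
  • Finally, run findInterval and use the result to index x, which picks out the matching value.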
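The reshape, sort, and findInterval steps can also be written out one at a time in base R, which may make the mechanics easier to follow. This is only a sketch under the assumption that the ranges in df1 do not overlap; the intermediate names long and idx are introduced here for illustration.

```r
df1 <- data.frame(lower = c(0, 5, 10, 15, 20),
                  upper = c(4, 9, 14, 19, 24),
                  x     = c(12, 45, 67, 89, 10))
df2 <- data.frame(age = c(12, 14, 5, 2, 9, 19, 22, 18, 23))

# Step 1: stack lower and upper into a single column, repeating x alongside
long <- data.frame(value = c(df1$lower, df1$upper),
                   x     = rep(df1$x, times = 2))

# Step 2: sort by the boundary values (findInterval needs a sorted vector)
long <- long[order(long$value), ]

# Step 3: for each age, find the index of the last boundary <= age,
# then use that index to pick the corresponding x
idx <- findInterval(df2$age, long$value)
df2$x <- long$x[idx]
df2$x
# [1] 67 67 45 12 45 89 10 89 10
```

Note that an age lying in a gap between two ranges (for example 4.5) would receive the x of the preceding range with this technique, and an age below the smallest boundary yields index 0, so a check against upper may be needed if such values can occur.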

The same idea using dplyr/tidyr:

library(tidyr)
library(dplyr)
df1 %>%
  gather(variable, value, -x) %>%
  arrange(value) %>%
  do(data.frame(x = .$x[findInterval(df2$age, .$value)])) %>%
  cbind(df2, .)
#   age  x
# 1  12 67
# 2  14 67
# 3   5 45
# 4   2 12
# 5   9 45
# 6  19 89
# 7  22 10
# 8  18 89
# 9  23 10