Filter data using a global variable with the same name as the column name

library(dplyr) 

Toy dataset:

 df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
 df
   x y
 1 1 4
 2 2 5
 3 3 6

This works great:

 df %>% filter(y == 5)
   x y
 1 2 5

This also works great:

 z <- 5
 df %>% filter(y == z)
   x y
 1 2 5

But this fails, returning every row instead of only the matching one:

 y <- 5
 df %>% filter(y == y)
   x y
 1 1 4
 2 2 5
 3 3 6

Apparently dplyr cannot distinguish between its column y and the global variable y. Is there a way to tell dplyr that the second y is a global variable?
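
The behaviour in the question can be reproduced without dplyr. The following is only a base R analogy, assuming filter() looks names up in the data frame before the surrounding environment, much as with() does; the name outer_y is introduced here purely for illustration.

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
y <- 5

# Inside with(), names are looked up in the data frame first,
# so both sides of `==` refer to the column y.
masked <- with(df, y == y)

# Copying the outer value to a name that is not a column avoids the clash.
# outer_y is a hypothetical name, not part of the question.
outer_y <- y
unmasked <- with(df, y == outer_y)

print(masked)    # every element is TRUE: the column is compared with itself
print(unmasked)  # TRUE only where the column equals 5
```

This is the same reason the z <- 5 version works: z is not a column, so the lookup falls through to the surrounding environment.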

+7
r dplyr
2 answers

You can do:

 df %>% filter(y == .GlobalEnv$y) 

or

 df %>% filter(y == .GlobalEnv[["y"]]) 

Both of them work in this context, but they will not if all this happens inside a function. But get will:

 df %>% filter(y == get("y"))
 f = function(df, y){df %>% filter(y==get("y"))}

So use get.
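
A hedged aside: how get("y") resolves inside filter() may depend on the dplyr version, because in versions built on rlang data masking the column can shadow the outer variable there as well. Assuming such a version is installed, the .env pronoun and the !! operator are two other ways to point at the outer y, and both also work inside a function. The helper names keep_matching and keep_matching_inject are hypothetical.

```r
library(dplyr)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# keep_matching is a hypothetical helper name used only for illustration.
keep_matching <- function(data, y) {
  # The bare y is the column; .env$y is the function argument.
  data %>% filter(y == .env$y)
}

# Same idea with injection: !!y is replaced by the argument's value
# before filter() looks anything up in the data.
keep_matching_inject <- function(data, y) {
  data %>% filter(y == !!y)
}

res_env <- keep_matching(df, 5)
res_inject <- keep_matching_inject(df, 5)

print(res_env)
print(res_inject)
```

Both calls should return the single row where the column y equals 5.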

Or just use df[df$y==y,] instead of dplyr.
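
A minimal sketch of the base R route wrapped in a function, to show that it does not suffer from the name clash: data$y is always the column, while the bare y is whatever the caller passed in. The function name filter_by_y is hypothetical.

```r
df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# filter_by_y is a hypothetical helper name, not from the answer.
filter_by_y <- function(data, y) {
  # data$y is explicitly the column; the bare y is the function argument.
  data[data$y == y, ]
}

result <- filter_by_y(df, 5)
print(result)
```

Unlike filter(), this keeps the original row name (2 here), and rows where the column is NA come back as all-NA rows rather than being dropped.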

+7

Access to the global environment can be obtained through the .GlobalEnv object:

 > filter(df, y==.GlobalEnv$y)
   x y
 1 2 5

Interestingly, using the globalenv() accessor function as a replacement for .GlobalEnv does not work in this scenario.
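
For comparison, a small base R check shows that outside dplyr the two spellings are interchangeable. So the difference noted above presumably comes from how filter() evaluated its arguments in the version tested at the time; that is an assumption, and other versions may behave differently. The name demo_value is hypothetical.

```r
# globalenv() and .GlobalEnv refer to the same environment object.
same_env <- identical(globalenv(), .GlobalEnv)

# Store a value there explicitly, then read it back through both spellings.
# demo_value is a hypothetical name chosen to avoid clobbering anything.
assign("demo_value", 5, envir = globalenv())
via_function <- globalenv()$demo_value
via_object <- .GlobalEnv$demo_value

print(same_env)
print(identical(via_function, via_object))
```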

+6