Passing parameters to a function using dplyr

I have the following function to describe a variable:

    library(dplyr)

    describe = function(.data, variable){
      args <- as.list(match.call())
      evalue = eval(args$variable, .data)
      summarise(.data,
                'n' = length(evalue),
                'mean' = mean(evalue),
                'sd' = sd(evalue))
    }

I want to use dplyr to describe a variable.

    set.seed(1)
    df = data.frame(
      'g' = sample(1:3, 100, replace=T),
      'x1' = rnorm(100),
      'x2' = rnorm(100)
    )

    df %>% describe(x1)
    #     n        mean        sd
    # 1 100 -0.01757949 0.9400179

The problem is that when I apply the same function after group_by, it is not applied within each group. Every group gets the statistics of the whole column:

    df %>% group_by(g) %>% describe(x1)
    # # A tibble: 3 x 4
    #       g     n        mean        sd
    #   <int> <int>       <dbl>     <dbl>
    # 1     1   100 -0.01757949 0.9400179
    # 2     2   100 -0.01757949 0.9400179
    # 3     3   100 -0.01757949 0.9400179

How can I change the function so that it returns per-group results, with as few modifications as possible?

2 answers

You need tidyeval:

    describe = function(.data, variable){
      evalue = enquo(variable)
      summarise(.data,
                'n' = length(!!evalue),
                'mean' = mean(!!evalue),
                'sd' = sd(!!evalue))
    }

    df %>% group_by(g) %>% describe(x1)
    # # A tibble: 3 x 4
    #       g     n        mean        sd
    #   <int> <int>       <dbl>     <dbl>
    # 1     1    27 -0.23852862 1.0597510
    # 2     2    38  0.11327236 0.8470885
    # 3     3    35  0.01079926 0.9351509

The dplyr vignette "Programming with dplyr" describes the use of enquo and !! in detail.
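For comparison, here is a minimal sketch of the same idea written with the embrace operator `{{ }}`, which combines enquo and !! in one step. It assumes rlang 0.4.0 or later is installed alongside dplyr, it rebuilds the df from the question, and it uses n() instead of length() for the group size. The name res is introduced only for illustration.

```r
library(dplyr)

# Sketch only: {{ variable }} captures the argument and unquotes it,
# so summarise() evaluates the column separately within each group.
describe <- function(.data, variable) {
  summarise(
    .data,
    n    = n(),
    mean = mean({{ variable }}),
    sd   = sd({{ variable }})
  )
}

# Same data as in the question
set.seed(1)
df <- data.frame(
  g  = sample(1:3, 100, replace = TRUE),
  x1 = rnorm(100),
  x2 = rnorm(100)
)

# 'res' is an illustrative name; one row per value of g
res <- df %>% group_by(g) %>% describe(x1)
res
```

The group sizes in res should add up to the number of rows in df, which is a quick way to check that the summary really was computed per group.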

Edit:

In response to Axeman's comment: I am not 100% sure why group_by and describe do not work together here. However, running debugonce on the original version of the function

    debugonce(describe)
    df %>% group_by(g) %>% describe(x1)

shows that evalue is not grouped; it is just a numeric vector of length 100.
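What the debugger shows can be reproduced outside the function. The sketch below rebuilds the df from the question and assumes that eval() treats a grouped data frame like any other list of columns, which is consistent with the length-100 vector observed above. The names grouped, whole and per_group are introduced only for illustration.

```r
library(dplyr)

set.seed(1)
df <- data.frame(
  g  = sample(1:3, 100, replace = TRUE),
  x1 = rnorm(100),
  x2 = rnorm(100)
)

grouped <- group_by(df, g)  # illustrative name

# eval() looks x1 up in the data frame as a plain list of columns.
# The grouping is ignored, so the whole column comes back.
evalue <- eval(quote(x1), grouped)
length(evalue)

# summarise() receives an already computed vector, so every group
# is given the same whole-column statistics.
whole <- summarise(grouped, n = length(evalue), mean = mean(evalue))

# Naming the column inside summarise() lets dplyr slice it per group.
per_group <- summarise(grouped, n = n(), mean = mean(x1))
```

In whole, every row repeats the size and mean of the full column, while per_group has a different size and mean for each value of g.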


Base R non-standard evaluation (NSE) also works:

    describe <- function(data, var){
      var_q <- substitute(var)
      data %>%
        summarise(n = n(),
                  mean = mean(eval(var_q)),
                  sd = sd(eval(var_q)))
    }

    df %>% describe(x1)
    #     n       mean       sd
    # 1 100 -0.1266289 1.006795

    df %>% group_by(g) %>% describe(x1)
    # # A tibble: 3 x 4
    #       g     n       mean       sd
    #   <int> <int>      <dbl>    <dbl>
    # 1     1    33 -0.1379206 1.107412
    # 2     2    29 -0.4869704 0.748735
    # 3     3    38  0.1581745 1.020831