How to apply a function to an entire table in a dplyr chain

I have a dplyr chain as follows:

myResults <- rawData %>% filter(stuff) %>% mutate(stuff) 

Now I want to apply the function myFunc to myResults. Is there a way to do this within the chain, or do I need to do the following as a separate step?

    myResults <- myFunc(myResults)
r dplyr
2 answers

If the function takes a data frame as its first argument, you can simply add it to the end of the chain.

    > myFunc <- function(x) sapply(x, max)
    > mtcars %>% filter(mpg > 20) %>% myFunc()
        mpg     cyl    disp      hp    drat      wt    qsec      vs      am    gear
     33.900   6.000 258.000 113.000   4.930   3.215  22.900   1.000   1.000   5.000
       carb
      4.000

Note that the %>% operator from magrittr, which dplyr uses, works with any object on the left-hand side, not only data frames, so you can easily do something like this:

    > inc <- function(x) x + 1
    > 1 %>% inc(.) %>% sqrt(.) %>% log(.)
    [1] 0.3465736

and, with some of the useful magrittr alias functions (extract, extract2, subtract, raise_to_power):

    library(magrittr)

    set.seed(1)
    inTrain <- sample(1:nrow(mtcars), 20)
    mtcarsTest <- mtcars %>% extract(-inTrain, )

    summaryPipe <- function(x) {print(summary(x)); x}

    mtcars %>%
        extract(inTrain, ) %>%
        # Train lm
        lm(mpg ~ ., .) %>%
        # Print summary and forward lm results
        summaryPipe %>%
        # Predict on the test set
        predict(newdata = mtcarsTest) %>%
        # Print results and forward arguments
        print %>%
        # Compute RMSE
        subtract(mtcarsTest %>% extract2('mpg')) %>%
        raise_to_power(2) %>%
        mean %>%
        sqrt

This is probably a matter of taste, but personally I find it very useful.

As @BondedDust mentioned in the comments, there are three possible ways to pass a function to %>%. Using the dot placeholder, you can also put the LHS in a position other than the first argument (see the call to lm).
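To make that last point concrete, here is a minimal sketch. It assumes the "three ways" are the bare function name, an empty call, and a call with an explicit dot; the comment thread is not reproduced here, so that reading is a guess. The names r1, r2, r3 and fit are only for illustration.

```r
library(magrittr)

inc <- function(x) x + 1

# Three spellings that magrittr treats the same way:
r1 <- 1 %>% inc       # bare function name
r2 <- 1 %>% inc()     # empty call, LHS becomes the first argument
r3 <- 1 %>% inc(.)    # explicit dot placeholder

# The dot also lets the LHS land somewhere other than the first argument.
# lm() expects a formula first, so the data frame is passed via `data = .`
fit <- mtcars %>% lm(mpg ~ wt, data = .)
```

All three of r1, r2 and r3 hold the same value, and fit is an ordinary lm object that can be piped further, for example into summary().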


You can use the existing mutate_each or summarise_each to apply a function to all columns, or to a selected subset of columns:

    library(dplyr)
    mtcars %>%
        filter(mpg > 20) %>%
        summarise_each(funs(max))
    #    mpg cyl disp  hp drat    wt qsec vs am gear carb
    # 1 33.9   6  258 113 4.93 3.215 22.9  1  1    5    4

Or pass a user-defined function:

    myFunc1 <- function(x) max(x)
    mtcars %>%
        filter(mpg > 20) %>%
        summarise_each(funs(myFunc1))
    #    mpg cyl disp  hp drat    wt qsec vs am gear carb
    # 1 33.9   6  258 113 4.93 3.215 22.9  1  1    5    4
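In more recent dplyr releases, summarise_each and funs are deprecated. A rough equivalent of the example above, assuming dplyr 1.0.0 or later is installed, uses across() inside summarise(). The variable name res is only for illustration.

```r
library(dplyr)

myFunc1 <- function(x) max(x)

# Apply myFunc1 to every column of the filtered data frame
res <- mtcars %>%
  filter(mpg > 20) %>%
  summarise(across(everything(), myFunc1))

print(res)
```

To restrict the function to a subset of columns, everything() can be swapped for another selection such as c(mpg, hp).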