Re-mutate a variable using dplyr and purrr

I am self-taught in R, and this is my first StackOverflow question. I apologize if the problem is an obvious one.

Short version of my question
I wrote a custom function to calculate the year-over-year percentage change in a variable. I would like to use the purrr map_at function to apply my custom function to a vector of variable names. My custom function works when applied to a single variable, but fails when I apply it using map_at.

My custom function

    library(dplyr)

    calculate_delta <- function(df, col) {
      # generate the new variable name
      newcolname <- paste("d", col, sep = "")

      # build the formula for the first difference
      calculate_diff <- lazyeval::interp(~(a + lag(a)) / a, a = as.name(col))

      # pass the formula to mutate_, naming the new variable with the name generated above
      df %>% mutate_(.dots = setNames(list(calculate_diff), newcolname))
    }

When I apply this function to a single variable in the mtcars dataset, the output is as expected (although, obviously, the resulting values are meaningless).

 calculate_delta(mtcars, "wt") 

Attempting to apply the function to a character vector with purrr

I think I am misunderstanding how map_at passes arguments to a function. All the example snippets I can find online use map_at with functions like is.character that do not require additional arguments. Here are my attempts to apply the function with purrr.

    library(purrr)

    vars <- c("wt", "mpg")
    mtcars %>% map_at(vars, calculate_delta)

It gives me this error message:

    Error in paste("d", col, sep = "") : argument "col" is missing, with no default

I assume this is because map_at passes vars as df and passes nothing for col. To get around this problem, I tried the following:

    vars <- c("wt", "mpg")
    mtcars %>% map_at(vars, calculate_delta, df = .)

This produces the following error:

 Error: unrecognised index type 

I have tried a bunch of different variations, including removing the df argument from the calculate_delta function, but I had no luck.

Other potential solutions

1) A version of this using sapply rather than purrr. I tried to solve the problem this way and ran into similar problems. Besides, my goal is to figure out a way to do this using purrr, if possible. Based on my understanding of purrr, this seems like a typical use case.

2) I can obviously think of how to implement this with a for loop, but I would like to avoid that if possible, for similar reasons.

Clearly, I am thinking about this the wrong way. Please help!

EDIT 1

To clarify, I wonder if there is a way to transform multiple variables that does the following:

1) Generates the new variables in the original tbl_df without replacing the columns being mutated (as happens when using dplyr's mutate_at).

2) Automatically creates new variable labels.

3) If possible, does both of the above in a single function call using map_at.

It may not be possible, but I feel that there must be an elegant way to accomplish what I am describing.

1 answer

Try to simplify the process:

    delta <- function(x) (x + dplyr::lag(x)) / x
    cols <- c("wt", "mpg")

    # This
    library(dplyr)
    mtcars %>% mutate_at(cols, delta)

    # Or
    library(purrr)
    mtcars %>% map_at(cols, delta)

    # If necessary, in a function
    f <- function(df, cols) {
      df %>% mutate_at(cols, delta)
    }

    f(iris, c("Sepal.Width", "Petal.Length"))
    f(mtcars, c("wt", "mpg"))
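To see why the original map_at attempt failed, here is a minimal sketch, assuming purrr is installed. It suggests that map_at calls the function once per selected column and passes that column's values as the first argument, so a function expecting a data frame plus a column name never receives col. The helper name describe and the variable out are illustrative only.

```r
library(purrr)

# Hypothetical helper: reports what the function actually receives.
# It gets each selected column as a plain vector,
# not the data frame and not the column name.
describe <- function(x) paste(class(x), "vector of length", length(x))

out <- map_at(mtcars, c("wt", "mpg"), describe)

out$wt              # what the function received for "wt"
is.data.frame(out)  # map_at returns a list, not a data frame
head(out$cyl)       # columns not selected are passed through unchanged
```

This is also why a single-argument function such as delta fits map_at and mutate_at directly.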

Edit

If you want to change the names afterwards, we can write a small custom helper function:

    Rename <- function(object, old, new) {
      # match() pairs each old name with its new name regardless of column order
      names(object)[match(old, names(object))] <- new
      object
    }

    mtcars %>%
      mutate_at(cols, delta) %>%
      Rename(cols, paste0("lagged", cols))

If you want the results as new, renamed columns (wt_lagged and mpg_lagged) while keeping the originals:

 mtcars %>% mutate_at(cols, funs(lagged = delta)) 
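In more recent dplyr releases, funs() is deprecated and mutate_at is superseded by across(). The following is a sketch under the assumption that dplyr 1.0.0 or later is available; it is not part of the original answer. It keeps the original columns and adds d-prefixed ones, mirroring the naming in the question's calculate_delta. The variable name result is illustrative.

```r
library(dplyr)

delta <- function(x) (x + dplyr::lag(x)) / x
cols <- c("wt", "mpg")

# Keep the original columns and append new ones named dwt and dmpg.
# The "{.col}" placeholder in .names is replaced by each input column name.
result <- mtcars %>%
  mutate(across(all_of(cols), delta, .names = "d{.col}"))

names(result)
```

Changing the .names template, for example to "{.col}_lagged", should give the same suffix style as the funs(lagged = delta) variant.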