Using ddply inside a function

I am trying to write a function that uses ddply inside it, but I can't get it to work. Below is a dummy example that reproduces what I get. Could this be caused by a bug?

    library(plyr)     # ddply comes from plyr
    library(ggplot2)  # provides the diamonds data set
    data(diamonds)

    foo <- function(data, fac1, fac2, bar) {
      res <- ddply(data, .(fac1, fac2), mean(bar))
      res
    }

    foo(diamonds, "color", "cut", "price")
+5
2 answers

I don't think this is a bug. ddply expects a function, and mean(bar) is not one: it is a call that R evaluates immediately. You need to write a complete function that computes the mean you want:

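    foo <- function(data, fac1, fac2, bar) {
      res <- ddply(data, c(fac1, fac2), function(x, ind) {
        mean(x[[ind]])  # [[ ]] extracts the column as a plain vector
      }, bar)           # bar is passed through ddply to the function as ind
      res
    }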

Also, you shouldn't pass strings to .(), so I changed it to c(). That way the arguments of foo can be passed straight through to ddply.

+10

There are several things wrong with the code, but the main problem is that you pass the column names as character strings.

Simply doing a find-and-replace of your parameters inside the function gives:

 res <- ddply(diamonds, .("color", "cut"), mean("price")) 

If you understand how ddply works (which I somewhat doubt, given the rest of the code), you will see that this cannot work. Ignoring the error in the last part (the function) for now, it should read as follows. Note the absence of quotes: the .() notation is nothing more than a way of supplying the quotes for you.

 res <- ddply(diamonds, .(color, cut), mean(price)) 

Fortunately, ddply also accepts a character vector of column names as its second argument, so (again ignoring the problem with the last parameter) this should become:

    foo <- function(data, facs, bar) {
      res <- ddply(data, facs, mean(bar))
      res
    }

    foo(diamonds, c("color", "cut"), "price")
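To illustrate that the two ways of naming the grouping columns are interchangeable, here is a minimal sketch. The toy data frame is made up for illustration and stands in for diamonds, and nrow is used as the summary function only because it already is a function that takes a data frame.

```r
library(plyr)

# Hypothetical toy data, standing in for diamonds
toy <- data.frame(
  color = c("D", "D", "E", "E"),
  cut   = c("Good", "Good", "Good", "Ideal"),
  price = c(100, 300, 50, 70),
  stringsAsFactors = FALSE
)

# Grouping columns given with the .() quoting helper
by_quoted <- ddply(toy, .(color, cut), nrow)

# The same grouping columns given as a character vector
by_names <- ddply(toy, c("color", "cut"), nrow)

# Both calls should describe the same groups with the same counts
all.equal(by_quoted, by_names)
```

The character-vector form is the one that works inside a function, because the column names then arrive as ordinary values rather than as unevaluated symbols.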

Finally: the function you pass to ddply must take a data.frame as its first argument. On each call it receives the subset of the data.frame you passed in (diamonds) for the current combination of color and cut. Neither mean("price") nor mean(price) is such a function. If you insist on using ddply, here is what you need to do:

    foo <- function(data, facs, bar) {
      res <- ddply(data, facs, function(dfr, colnm) {
        mean(dfr[[colnm]])  # [[ ]] extracts the column as a plain vector
      }, bar)               # bar is passed through ddply as colnm
      res
    }

    foo(diamonds, c("color", "cut"), "price")
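For a quick check without loading diamonds, the same pattern can be tried on a small data frame. This is only a sketch: the toy data and the output column name mean_value are made up for illustration, and the inner function returns a one-row data frame so that the result column gets a readable name.

```r
library(plyr)

# Hypothetical toy data, standing in for diamonds
toy <- data.frame(
  color = c("D", "D", "E", "E"),
  cut   = c("Good", "Good", "Good", "Ideal"),
  price = c(100, 300, 50, 70),
  stringsAsFactors = FALSE
)

foo <- function(data, facs, bar) {
  ddply(data, facs, function(dfr, colnm) {
    # dfr holds the rows of one group; colnm is the column name as a string
    data.frame(mean_value = mean(dfr[[colnm]]))
  }, bar)  # extra arguments after the function are forwarded to it
}

res <- foo(toy, c("color", "cut"), "price")
print(res)
```

The result should have one row per combination of color and cut that occurs in the data, with the group mean of price in mean_value.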
+10
