Implementing an aggregate function in dmapply (ddR package)

I would like to run the aggregate function inside dmapply, as offered by the ddR package.

Desired Results

The desired results match the simple output generated by aggregate in base R:

    aggregate(
      x = mtcars$mpg,
      FUN = function(x) {
        mean(x, na.rm = TRUE)
      },
      by = list(trans = mtcars$am)
    )

which produces:

      trans        x
    1     0 17.14737
    2     1 24.39231

Attempt - dmapply

I would like to get the same results when using dmapply, as shown below:

    # ddR
    require(ddR)

    # ddR object creation
    distMtcars <- as.dframe(mtcars)

    # Aggregate / dmapply
    dmapply(
      FUN = function(x, y) {
        aggregate(FUN = mean(x, na.rm = TRUE), x = x, by = list(trans = y))
      },
      distMtcars$mpg,
      y = distMtcars$am,
      output.type = "dframe",
      combine = "rbind"
    )

Code Failure:

    Error in match.fun(FUN) :
      'mean(x, na.rm = TRUE)' is not a function, character or symbol
    Called from: match.fun(FUN)


Updates

The correction pointed out by @Mike resolves this error but does not give the desired result. Code:

    # Avoid namespace conflict with other packages
    ddR::collect(
      dmapply(
        FUN = function(x, y) {
          aggregate(
            FUN = function(x) {
              mean(x, na.rm = TRUE)
            },
            x = x,
            by = list(trans = y)
          )
        },
        distMtcars$mpg,
        y = distMtcars$am,
        output.type = "dframe",
        combine = "rbind"
      )
    )

gives:

    [1] trans x
    <0 rows> (or 0-length row.names)
1 answer

It works fine for me if you change the FUN argument of aggregate to match the one you used before: FUN = function(x) mean(x, na.rm = T). The reason R cannot use mean(x, na.rm = T) as FUN is that it is not a function (it is a function call), whereas mean itself is a function.
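The difference between a function and a function call can be checked in base R without ddR. The snippet below is only a small sketch, not part of the original answer; the names values and mean_no_na are made up for illustration. It passes both forms to match.fun, the function named in the error message above.

```r
# FUN must be a function object (or its name), not the result of calling one.
values <- c(21.0, 22.8, NA, 18.7)  # illustrative data, not from mtcars

# A function object: match.fun() returns it unchanged, and it can be called later.
mean_no_na <- function(x) mean(x, na.rm = TRUE)
f <- match.fun(mean_no_na)
f(values)

# A function call: it evaluates to a number, which match.fun() rejects.
# tryCatch() turns the resulting error into the string "error".
outcome <- tryCatch(
  match.fun(mean(values, na.rm = TRUE)),
  error = function(e) "error"
)
outcome
```

Wrapping the call in function(x) ... defers the evaluation, which is why the anonymous function version is accepted.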

It will also give you NA results if you don't change x = distMtcars$mpg to x = collect(distMtcars)$mpg. The same goes for y. With all that said, I think this should work for you:

    res <- dmapply(
      FUN = function(x, y) {
        aggregate(FUN = function(x) mean(x, na.rm = TRUE), x = x, by = list(trans = y))
      },
      x = list(collect(distMtcars)$mpg),
      y = list(collect(distMtcars)$am),
      output.type = "dframe",
      combine = "rbind"
    )

Then you can do collect(res) to see the result.

    collect(res)
    #   trans        x
    # 1     0 17.14737
    # 2     1 24.39231
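The list() wrapping in the dmapply call above can be illustrated with base R's mapply, on the assumption that dmapply follows the same element-wise semantics (ddR describes it as a distributed version of mapply). This is an analogy only: no ddR objects are involved, and per_group_mean is a made-up helper name.

```r
# Base R analogy only; nothing here is distributed.
per_group_mean <- function(x, y) {
  aggregate(x = x, by = list(trans = y), FUN = function(v) mean(v, na.rm = TRUE))
}

# Each argument is a list of length one, so mapply calls the function once,
# passing the whole mpg and am vectors.
res_base <- mapply(
  per_group_mean,
  list(mtcars$mpg),
  list(mtcars$am),
  SIMPLIFY = FALSE
)
res_base[[1]]

# Without list(), mapply iterates over the individual values,
# i.e. one call per row of mtcars.
n_calls <- length(mapply(function(x, y) x, mtcars$mpg, mtcars$am))
n_calls
```

Under that assumption, passing bare vectors makes the function see one value at a time, while the list() form hands it the complete columns in a single call.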