dplyr::n() returns "Error: This function should not be called directly"

If I do this:

dplyr::mutate(MeanValue = mean(RSSI), ReadCount = n()) 

everything works fine. But when I qualify the function with its namespace:

 dplyr::mutate(MeanValue = mean(RSSI), ReadCount = dplyr::n()) 

I get the error in the title.

So I don't really have a problem, I can just avoid it, but I wonder why this is happening. I already looked at another question ( dplyr: "Error in n(): function should not be called directly" ), but as far as I know, dplyr is the only library I am using. I tried to do what the answer there suggests, but

 detach(package:plyr) 

leads to

Error in detach(package:plyr) : invalid 'name' argument

and

 conflicts() 

doesn't mention n():

[1] "filter" "lag" "body<-" "intersect" "kronecker" "setdiff" "setequal" "union"

Most of these are caused by dplyr.

I guess I'm not the only one who has been puzzled by this?

+14
r dplyr
3 answers

So I don't really have a problem, I can just avoid [writing dplyr::n()], but I wonder why this is happening.

Here is the source code for dplyr::n in dplyr 0.5.0:

 function () { stop("This function should not be called directly") } 

This is why the fully qualified form raises this error: the function always throws an error. (My guess is that the error-throwing dplyr::n function exists so that n() can have a regular documentation page with examples.)

Inside filter / mutate / summarise calls, n() does not call this function. Instead, some internal function calculates the group sizes for the expression n(). This is why the following works even when dplyr is not loaded:

    n()
    #> Error: could not find function "n"

    library(magrittr)
    iris %>%
      dplyr::group_by(Species) %>%
      dplyr::summarise(n = n())
    #> # A tibble: 3 × 2
    #>      Species     n
    #>       <fctr> <int>
    #> 1     setosa    50
    #> 2 versicolor    50
    #> 3  virginica    50

Here the bare n() cannot be resolved to a function, so we get an error. But when it is used inside a dplyr verb, n() does resolve to something and returns the group sizes.
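
To make the idea concrete, here is a minimal base R sketch of a verb that decides for itself what n() means. It is not how dplyr implements it; n_stub and toy_summarise are made-up names for illustration, and the only assumption is that the built-in iris data set is available.

```r
# Stand-in for the exported function: it always errors when called directly.
n_stub <- function() {
  stop("This function should not be called directly")
}

# Hypothetical toy verb: evaluates `expr` once per group and supplies its
# own definition of n() for that evaluation, so n_stub() is never called.
toy_summarise <- function(data, group, expr) {
  expr <- substitute(expr)          # capture the expression unevaluated
  caller <- parent.frame()
  pieces <- split(data, data[[group]])
  vapply(pieces, function(piece) {
    env <- new.env(parent = caller)
    # Inside the verb, n() means "number of rows in the current group".
    assign("n", function() nrow(piece), envir = env)
    as.numeric(eval(expr, piece, env))
  }, numeric(1))
}

# n() is resolved by the verb, not looked up as a regular function.
counts <- toy_summarise(iris, "Species", n())
print(counts)

# Calling the stub directly always fails.
direct <- tryCatch(n_stub(), error = function(e) conditionMessage(e))
print(direct)
```

The point of the sketch is only that the expression n() is captured and evaluated in an environment the verb controls, which is why the exported function never has to do any real work.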

+16

I know I'm 2 years late, but here is my take.

Grouping in dplyr does not actually do anything to the data. It just marks the data as grouped. This means that functions such as mean or n have to be aware of this and infer from their wider context that they must perform their calculations per group. They are not ordinary functions that are unaware of this context. Basically, they are symbols that summarise() or mutate() choose to evaluate in a certain way (means or counts per group). I think Hadley decided to show the error if you call n() directly, as this is slightly better than not implementing the function at all.

+2

I think this is caused by masking between plyr and dplyr. Anyway, this solves it:

 dplyr::summarise(count = n()) 
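
Before blaming masking, it may be worth checking whether plyr is attached at all. The sketch below uses only base R; the variable names are made up for illustration. Calling detach(package:plyr) when plyr is not on the search path fails with the "invalid 'name' argument" error, so the check guards against that.

```r
# TRUE only if plyr is currently on the search path.
plyr_attached <- "package:plyr" %in% search()

if (plyr_attached) {
  # Only detach when the package really is attached.
  detach("package:plyr")
}

# After the guard, plyr should no longer be on the search path.
still_attached <- "package:plyr" %in% search()
print(plyr_attached)
print(still_attached)
```

If plyr_attached is FALSE, masking by plyr cannot be the cause in that session.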
+2
