R tapply with null function

Question

I am having trouble understanding the tapply function when the FUN argument is NULL.

The documentation says:

If FUN is NULL, tapply returns a vector which can be used to subscript the multi-way array tapply normally produces.

For example, what does the following example from the documentation do?

 ind <- list(c(1, 2, 2), c("A", "A", "B"))
 tapply(1:3, ind) #-> the split vector

I do not understand the results:

 [1] 1 2 4

Thanks.

+5

r tapply

carmellose May 23 '16 at 12:39

1 answer

Iaroslav domin · Answer 1 · 2016-05-23T13:09:23+0000

If you run tapply with a function supplied (not NULL), say sum, as in the help page, you will see that the result is a two-dimensional array with NA in one cell:

 > res <- tapply(1:3, ind, sum)
 > res
   A  B
 1 1 NA
 2 2  3

This means that one combination of factors, namely (1, B), is absent from the data. When FUN is NULL, tapply returns, for each element of the input, the index of its cell in this array (counting down the columns), so only the indices of non-empty cells appear. To check this:

 > which(!is.na(res))
 [1] 1 2 4
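The check above matches because every non-empty cell here holds exactly one element. As a minimal sketch of the per-element behaviour, the snippet below uses hypothetical data (x2 and g are made up for illustration, not taken from the help page) in which two elements fall into the same cell, and then uses the returned indices to subscript the array of group sums:

```r
# Original example: one element per non-empty cell
ind <- list(c(1, 2, 2), c("A", "A", "B"))
idx <- tapply(1:3, ind)            # FUN = NULL: one cell index per element
res <- tapply(1:3, ind, sum)       # 2 x 2 array of group sums

# Cells are numbered down the columns:
#   (1,A) = 1, (2,A) = 2, (1,B) = 3, (2,B) = 4
idx

# Hypothetical data: elements 2 and 4 both belong to cell (2,A)
x2 <- c(10, 20, 30, 40)
g  <- list(c(1, 2, 2, 2), c("A", "A", "B", "A"))
idx2 <- tapply(x2, g)              # index 2 appears twice
res2 <- tapply(x2, g, sum)

# Subscripting the array with the indices gives each element's group sum
group_sum <- res2[idx2]
group_sum
```

With repeated cells the result of tapply(x2, g) is no longer the same as which(!is.na(res2)), which lists each non-empty cell only once.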

Note that the supplied function can itself return NA, as in the following toy example:

 > f <- function(x){
 +   if(x[[1]] == 1) return(NA)
 +   return(sum(x))
 + }
 > tapply(1:3, ind, f)
    A  B
 1 NA NA
 2  2  3

Thus, in general, an NA in the result does not necessarily mean that the factor combination is absent.
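One possible way to tell the two kinds of NA apart is to count the observations per cell with table() and compare that with the tapply result. This is only a sketch of one approach, not the only one; the names counts, empty and from_f are introduced here for illustration:

```r
ind <- list(c(1, 2, 2), c("A", "A", "B"))

# Toy function from the answer: returns NA whenever the group starts with 1
f <- function(x) {
  if (x[[1]] == 1) return(NA)
  sum(x)
}

res    <- tapply(1:3, ind, f)
counts <- table(ind[[1]], ind[[2]])   # number of observations in each cell
empty  <- counts == 0                 # TRUE only for absent combinations

# NA cells that contain data, i.e. the NA was produced by f itself
from_f <- is.na(res) & !empty

empty
from_f
```

Here only cell (1, B) is empty, while the NA in cell (1, A) comes from f.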