Assign values to groups when all that matters is the number of group members

Question


I recently encountered the following grouped operation: within each group, the members are assigned evenly spaced values between -0.5 and 0.5 (in order of appearance), and if a group has only one member, that member is assigned the value 0. For example, if I had the following observed groups:

g <- c("A", "A", "B", "B", "A", "C")

Then I would expect the assigned values:

 outcome <- c(-0.5, 0, -0.5, 0.5, 0.5, 0)

The three observations in group A were assigned the values -0.5, 0 and 0.5 (in order), the two observations in group B were assigned -0.5 and 0.5 (in order), and the single observation in group C was assigned 0.

Usually, when I perform a grouped operation on one vector to get another vector, I use the ave function in the form ave(data.vector, group.vector, FUN=function.to.apply.to.each.groups.data.vector.subset). However, in this operation all I need to know is the number of members in each group, so there is no data.vector. As a result, I simply made up a dummy data vector, whose values I ignored when I called ave:

    ave(rep(NA, length(g)), g, FUN=function(x) {
      if (length(x) == 1) {
        return(0)
      } else {
        return(seq(-0.5, 0.5, length=length(x)))
      }
    })
    # [1] -0.5 0.0 -0.5 0.5 0.5 0.0

Although this gives me the correct answer, having to construct a dummy data vector whose values I then ignore is rather unsatisfying. Is there a better way to assign values to groups when all that matters is the number of elements in the group?


r

josliber May 31 '15 at 2:59 pm


1 answer

josliber · Accepted Answer · 2015-05-31T19:26:58+0000

Judging from the comments, there does not seem to be a version of ave that accepts only a grouping and calls a function with the number of elements in each group. I suppose this is not particularly surprising, since it is a rather specialized operation.
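As an aside, the group size on its own is easy to look up per observation in base R without ave. The following is only a small sketch (the variable names sizes and n_per_obs are made up for illustration). It does not solve the original problem, because members of the same group still need different values, but it shows which part of the task is straightforward:

```r
g <- c("A", "A", "B", "B", "A", "C")

# Count the members of each group, then look the count up for every observation.
sizes <- table(g)
n_per_obs <- as.vector(sizes[g])  # index the table by group name, drop the names
n_per_obs
# [1] 3 3 2 2 3 1
```

Spreading a sequence of values across the members of each group is the step that still requires a grouped assignment such as ave.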

If I had to do this often, I could roll my own version of ave with the desired properties, as a thin wrapper around ave:

    ave.len <- function(..., FUN) {
      l <- list(...)
      do.call("ave", c(list(x=rep(NA, length(l[[1]]))), l, FUN=function(x) FUN(length(x))))
    }

    # Original operation, using @akrun 1-line command for sequences
    g <- c("A", "A", "B", "B", "A", "C")
    ave.len(g, FUN=function(n) seq(-0.5, 0.5, length=n) * (n!=1) + 0L)
    # [1] -0.5 0.0 -0.5 0.5 0.5 0.0

    # Group of size n has the n^th letter in the alphabet
    ave.len(g, FUN=function(n) rep(letters[n], n))
    # [1] "c" "c" "b" "b" "c" "a"

    # Multiple groups via the ... argument (here everything in own group)
    ave.len(g, 1:6, FUN=function(n) rep(letters[n], n))
    # [1] "a" "a" "a" "a" "a" "a"
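For comparison, the same idea can be sketched without ave at all, using split and unsplit from base R. This is not the approach above, only an alternative under the assumption of a single grouping vector; the helper name by_size is hypothetical. It splits the positions of the observations by group, calls FUN with each group's size, and puts the results back in the original order:

```r
# Hypothetical helper: call FUN with each group's size and return the
# results aligned with the original observations.
by_size <- function(g, FUN) {
  idx <- split(seq_along(g), g)                    # positions of each group's members
  vals <- lapply(idx, function(i) FUN(length(i)))  # FUN sees only the group size
  unsplit(vals, g)                                 # restore the original order
}

g <- c("A", "A", "B", "B", "A", "C")
res <- by_size(g, function(n) if (n == 1) 0 else seq(-0.5, 0.5, length.out = n))
res
# values: -0.5, 0, -0.5, 0.5, 0.5, 0
```

This version still builds an index vector internally, so whether it is more satisfying than a placeholder vector of NA values is a matter of taste. FUN must return one value per group member for unsplit to line the results up.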
