I recently encountered the following grouped operation: within each group, the elements are assigned evenly spaced values between -0.5 and 0.5 (in order of appearance), and a group with only one element is assigned the value 0. For example, given the following observed groups:
g <- c("A", "A", "B", "B", "A", "C")
Then I would expect the assigned values:
outcome <- c(-0.5, 0, -0.5, 0.5, 0.5, 0)
The three observations in group A are assigned -0.5, 0 and 0.5 (in order), the two observations in group B are assigned -0.5 and 0.5 (in order), and the single observation in group C is assigned 0.
Usually, when I perform a grouped operation on one vector to get another vector, I use the ave function in the form ave(data.vector, group.vector, FUN=function.to.apply.to.each.groups.data.vector.subset). However, this operation only needs the number of members in each group, so there is no data.vector. As a result, I simply made up a dummy data vector, whose values are ignored in the call to ave:
ave(rep(NA, length(g)), g, FUN=function(x) {
  if (length(x) == 1) {
    return(0)
  } else {
    return(seq(-0.5, 0.5, length=length(x)))
  }
})
Although this gives me the correct answer, having to create a data vector only to ignore it is rather unsatisfactory. Is there a better way to assign values to groups when all that matters is the number of elements in each group?
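For reproducibility, here is a minimal self-contained sketch of the workaround described above, checked against the expected values. The helper name spread_values is made up for this example and is not part of any package; the rest uses only base R.

```r
g <- c("A", "A", "B", "B", "A", "C")
expected <- c(-0.5, 0, -0.5, 0.5, 0.5, 0)

# Hypothetical helper (name invented for this example): evenly spaced
# values from -0.5 to 0.5 for a group of size n, or 0 for a singleton.
spread_values <- function(n) {
  if (n == 1) 0 else seq(-0.5, 0.5, length.out = n)
}

# The dummy vector exists only so that ave() can split it by group;
# its values are never read, only the length of each subset is used.
dummy <- rep(NA_real_, length(g))
result <- ave(dummy, g, FUN = function(x) spread_values(length(x)))

# Stops with an error if the result differs from the expected values.
stopifnot(isTRUE(all.equal(result, expected)))
```

The dummy vector is still there, which is exactly the part I would like to get rid of.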