Here is a simple example illustrating the problem:
    library(data.table)
    dt = data.table(a = c(1,1,2,2), b = 1:2)
    dt[, c := cumsum(a), by = b][, d := cumsum(a), by = c]
    #    a b c d
    # 1: 1 1 1 1
    # 2: 1 2 1 2
    # 3: 2 1 3 2
    # 4: 2 2 3 4
When I try to do the same in dplyr, it fails because the first group_by persists, so the second mutate ends up grouped by both b and c:
    library(dplyr)
    df = data.frame(a = c(1,1,2,2), b = 1:2)
    df %.%
      group_by(b) %.%
      mutate(c = cumsum(a)) %.%
      group_by(c) %.%
      mutate(d = cumsum(a))
Is this a bug or a feature? If it is a feature, how can the data.table solution be replicated in a single expression?