The problem (and the reasoning behind it) comes down to the fact that an aggregated value cannot be assigned back in a straightforward way.
It’s easier to see this in action if you look at a data table with more columns than just the ones used for the calculation.
Please note that if we just want to output the calculated value, then the expression on the RHS, as you have it, is just fine.
In other words, the data is collapsed so that only the unique values (one row per group) are returned.
However, if you want to save this value back to the SAME data table (which is what happens when you use the `:=` operator), then all rows identified in `i` (all rows by default) are assigned a value. This makes sense when you look at the output with the extra columns.
Copying this data.table to `agg` then still carries along all of the rows.
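As a minimal sketch of the difference (the table `dt` and its columns `id`, `grp` and `value` are made up for illustration and are not from the question):

```r
library(data.table)

# Hypothetical toy table: 2 groups of 3 rows each, plus an extra column `id`
dt <- data.table(id    = 1:6,
                 grp   = rep(c("a", "b"), each = 3),
                 value = c(1, 2, 3, 10, 20, 30))

# Without `:=` the aggregate is simply returned: one row per group
res <- dt[, list(value = mean(value)), by = grp]
nrow(res)   # 2

# With `:=` the table is modified in place: every row gets its group's mean
dt[, value := mean(value), by = grp]
nrow(dt)    # 6

# Assigning the modified table to a new name still gives all rows
agg <- dt
nrow(agg)   # 6
```

The extra `id` column is what makes the second result reasonable: each of the six rows is still a distinct row, so each one keeps its own copy of the group mean.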
Therefore, if you want to copy to a new table only those rows from your original table that are unique, you can
a. wrap the original table inside `unique()` before assigning it
b. assign the table, above, that is returned when you are not assigning the RHS output (which is what @Arun suggested)
Example a. will be:
    agg2 <- unique(dtb[, value := mean(value), by = list(month, fac)])
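For comparison, here is a rough sketch of option b. next to option a. The toy table `dt` is hypothetical, and the `copy()` call is my own addition: `:=` changes a table by reference, so without it the original table would be overwritten as well.

```r
library(data.table)

# Hypothetical toy table with repeated (month, fac) combinations
dt <- data.table(month = rep(c("Jan", "Feb"), each = 4),
                 fac   = rep(c("a", "b"), times = 4),
                 value = as.numeric(1:8))

# Option b: aggregate without `:=` and assign the table that is returned
agg_b <- dt[, list(value = mean(value)), by = list(month, fac)]

# Option a, run on a copy so that `dt` itself is left untouched
agg_a <- unique(copy(dt)[, value := mean(value), by = list(month, fac)])

nrow(agg_b)  # 4
nrow(agg_a)  # 4
```

Both give one row per `month`/`fac` combination here. Note that option a. only collapses fully when the remaining columns are identical within each group, which is the case in this toy table.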
The following example may help illustrate.
(You will need to copy + paste this, as the output is omitted)
    # SAMPLE DATA, as above
    library(data.table)
    dtb.bak <- data.table(expand.grid(month = rep(month.abb[1:3], each = 3),
                                      fac = letters[1:3]),
                          value = rnorm(27))