I am trying to combine factor levels in data.table and wonder if there is a data.table-y way to do this.
Example:
library(data.table)
DT = data.table(id = 1:20, ind = as.factor(sample(8, 20, replace = TRUE)))
I want to say that types 1, 3, and 8 are in group A; 2 and 4 are in group B; and 5, 6, and 7 are in group C.
Here is what I did, which was pretty slow in the full version of the problem:
DT[ind %in% c(1, 3, 8), grp := as.factor("A")]
DT[ind %in% c(2, 4), grp := as.factor("B")]
DT[ind %in% c(5, 6, 7), grp := as.factor("C")]
Another approach suggested by this related question could be translated as follows:
DT[ , grp := ind]
levels(DT$grp) = c("A", "B", "A", "B", "C", "C", "C", "A")
Or maybe (given that I have 65 base groups and 18 aggregated groups, this seems a bit neater):
DT[ , grp := ind]
lev <- letters[1:8]  # placeholder vector; every element is overwritten below
lev[c(1, 3, 8)] <- "A"
lev[c(2, 4)] <- "B"
lev[5:7] <- "C"
levels(DT$grp) <- lev
Both of these seem bulky. Is either of them an appropriate way to do this in data.table?
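For reference, one more variant of the levels idea: base R's `levels<-` also accepts a named list, where each name is a new level and its values are the old levels to merge. Below is a minimal sketch of that, run inside `j` so the column is still added with `:=`. The `set.seed()` call, the explicit `levels = 1:8`, and the name `grp_map` are my own additions for illustration, and I have not timed this on the full data.

```r
library(data.table)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
# levels = 1:8 guarantees all eight levels exist even if sample() misses one
DT <- data.table(id = 1:20,
                 ind = factor(sample(8, 20, replace = TRUE), levels = 1:8))

# Hypothetical helper name: each element name is a new level,
# its values are the old levels to merge into it
grp_map <- list(A = c("1", "3", "8"),
                B = c("2", "4"),
                C = c("5", "6", "7"))

# Copy ind, relabel its levels via the list form of levels<-, assign with :=
DT[, grp := {
  g <- ind
  levels(g) <- grp_map
  g
}]
```

With 65 base groups and 18 aggregated groups, the named list keeps the mapping in one place instead of spreading it over several index assignments.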