I don't know how to get it in matrix form right away, but I find this solution useful:
    dt[, {x = value; dt[, cor(x, value), by = group]}, by = group]

       group group        V1
    1:     a     a 1.0000000
    2:     a     b 0.1556371
    3:     b     a 0.1556371
    4:     b     b 1.0000000

since you start with a molten dataset and end up with a molten representation of the correlations.
With this form you can also compute only selected pairs. In particular, the correlation matrix is symmetric, so computing both triangles is a waste of time. For example:
    dt[, {x = value; g = group
          dt[group <= g, list(cor(x, value)), by = group]},
       by = group]

       group group        V1
    1:     a     a 1.0000000
    2:     b     a 0.1556371
    3:     b     b 1.0000000

Alternatively, this form works just as well for the cross-correlation between two sets (i.e. the off-diagonal block):
    library(data.table)
    set.seed(1)  # reproducibility
    dt1 <- data.table(id = 1:4, group = rep(letters[1:2], c(4, 4)), value = rnorm(8))
    dt2 <- data.table(id = 1:4, group = rep(letters[3:4], c(4, 4)), value = rnorm(8))
    setkey(dt1, group)
    setkey(dt2, group)

    # outer loop over the groups of dt1, inner loop over the groups of dt2
    dt1[, {x = value; g = group
           dt2[, list(cor(x, value)), by = group]},
        by = group]

       group group          V1
    1:     a     c -0.39499814
    2:     a     d  0.74234458
    3:     b     c  0.96088312
    4:     b     d  0.08016723

Obviously, if you ultimately want the result in matrix form, you can use dcast or dcast.data.table. Note, however, that the examples above return two columns with the same name (group); to fix this, rename them inside the j expression. For the original problem:
    dcast.data.table(
      dt[, {x = value; g1 = group
            dt[, list(g1, g2 = group, c = cor(x, value)), by = group]},
         by = group],
      g1 ~ g2, value.var = "c")

       g1         a         b
    1:  a 1.0000000 0.1556371
    2:  b 0.1556371 1.0000000
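
If a plain matrix is all you need, one possible shortcut (not the nested approach above) is to cast the molten data to wide form first and hand the group columns to base `cor()`. This is only a sketch: the `dt` below is a hypothetical stand-in for the original data, and it assumes every group has exactly one value per `id` so that rows line up across groups.

```r
library(data.table)

set.seed(1)  # reproducibility
# Hypothetical stand-in for the molten data: 4 ids in each of 2 groups
dt <- data.table(id    = rep(1:4, 2),
                 group = rep(letters[1:2], each = 4),
                 value = rnorm(8))

# Wide form: one row per id, one column per group
wide <- dcast(dt, id ~ group, value.var = "value")

# Drop the id column and let base cor() build the full symmetric matrix
m <- cor(as.matrix(wide[, !"id"]))
m
```

The nested form is still the better fit when you only want some of the pairs or a cross-correlation between two different tables, since `cor()` on the wide matrix always computes every pair.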