Overlap matrix in R

Question


I have the following data frame

    id channel
     1       a
     1       b
     1       c
     2       a
     2       c
     3       a

I would like to create an overlap matrix. This is a square matrix whose rows and columns are both labelled a, b, c, with each entry showing how many ids the two channels have in common. For the example above, the matrix would look like

      a b c
    a 3 1 2
    b 1 1 1
    c 2 1 2

Thank you very much in advance.

+7

r

broccoli Jul 31 '12 at 18:50


2 answers

    library(plyr)
    df
    #   id channel
    # 1  1       a
    # 2  1       b
    # 3  1       c
    # 4  2       a
    # 5  2       c
    # 6  3       a
    tb <- table(ddply(df, .(id), function(x) {x$id <- x$channel; expand.grid(x)}))
    tb
    #    channel
    # id  a b c
    #   a 3 1 2
    #   b 1 1 1
    #   c 2 1 2
    names(dimnames(tb)) <- NULL
    tb
    #     a b c
    #   a 3 1 2
    #   b 1 1 1
    #   c 2 1 2

Now some explanation, starting with how table() builds a two-way table. In ?table there is an example:

    a <- letters[1:3]
    (b <- sample(a))
    # [1] "b" "c" "a"
    table(a, b)
    #    b
    # a   a b c
    #   a 0 1 0
    #   b 0 0 1
    #   c 1 0 0

So table() pairs up the elements of its arguments by position and counts how often each pair occurs. Now if we have

    id channel
     1       a
     1       b
     1       c
     2       a
    ...

then the fact that channels share an id can be shown by splitting the data frame by id, creating a copy of the channel column, and taking all combinations of these two columns. For id 1, which has channels a, b and c:

    tbl <- expand.grid(data.frame(x = c("a","b","c"), y = c("a", "b", "c")))
    tbl
    #   x y
    # 1 a a
    # 2 b a
    # 3 c a
    # 4 a b
    # 5 b b
    # 6 c b
    # 7 a c
    # 8 b c
    # 9 c c
    table(tbl$x, tbl$y)
    #     a b c
    #   a 1 1 1
    #   b 1 1 1
    #   c 1 1 1
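The same "all channel pairs within each id" idea can probably be sketched in base R without plyr. This is not the code used above. It swaps in merge() as a self-join on id, which should produce one row per pair of channels that share an id. The names pairs and tb2 are made up for illustration.

```r
# Same data as in the question.
df <- data.frame(id = c(1, 1, 1, 2, 2, 3),
                 channel = letters[c(1, 2, 3, 1, 3, 1)])

# Self-join on id: every row is a pair of channels that share that id.
# merge() names the two channel columns channel.x and channel.y.
pairs <- merge(df, df, by = "id")

# Counting the pairs gives the overlap matrix.
tb2 <- table(pairs$channel.x, pairs$channel.y)
tb2
```

The self-join grows with the square of the number of channels per id, so for large data a matrix-product approach is likely cheaper.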

+5

Julius Jul 31 '12 at 19:25


Josh O'Brien · Accepted Answer · 2012-07-31T19:16:55+0000

This should do the trick:

    df <- data.frame(id=c(1,1,1,2,2,3), channel=letters[c(1,2,3,1,3,1)])  # your data
    m <- table(df[[1]], df[[2]])
    ## Alternatively: m <- do.call(table, df)
    t(m) %*% m
    #     a b c
    #   a 3 1 2
    #   b 1 1 1
    #   c 2 1 2
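A brief sketch of why the product works, under the assumption that each id/channel pair should count at most once: m is an id-by-channel indicator table, so entry (i, j) of t(m) %*% m sums, over ids, the product of the indicators for channels i and j. That sum is the number of ids the two channels share. The unique() call and the use of crossprod() are additions for illustration, not part of the answer above.

```r
# Same data as in the question.
df <- data.frame(id = c(1, 1, 1, 2, 2, 3),
                 channel = letters[c(1, 2, 3, 1, 3, 1)])

# Drop repeated id/channel rows so every cell of m is 0 or 1
# (an assumption: a repeated pair should not be counted twice).
m <- table(unique(df))

# crossprod(m) computes t(m) %*% m without forming the transpose explicitly.
overlap <- crossprod(m)
overlap
```

The diagonal of the result holds the number of distinct ids per channel, and the matrix is symmetric by construction.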
