How to extract values between adjacent variables in a correlation matrix in R?

Question

I have a huge correlation matrix, but here is just an example:

    set.seed(1234)
    corrmat <- matrix(round(rnorm(36, 0, 0.3), 2), ncol = 6)
    rownames(corrmat) <- colnames(corrmat) <- c("A", "b1", "b2", "C", "L", "ctt")
    diag(corrmat) <- NA
    corrmat[upper.tri(corrmat)] <- NA

            A    b1    b2     C     L ctt
    A      NA    NA    NA    NA    NA  NA
    b1   0.08    NA    NA    NA    NA  NA
    b2   0.33 -0.17    NA    NA    NA  NA
    C   -0.70 -0.27 -0.03    NA    NA  NA
    L    0.13 -0.14 -0.15 -0.13    NA  NA
    ctt  0.15 -0.30 -0.27  0.14 -0.28  NA

    > melt(corrmat)
        X1  X2 value
    1    A   A    NA
    2   b1   A  0.08
    3   b2   A  0.33
    4    C   A -0.70
    5    L   A  0.13
    6  ctt   A  0.15
    7    A  b1    NA
    8   b1  b1    NA
    9   b2  b1 -0.17
    10   C  b1 -0.27
    11   L  b1 -0.14
    12 ctt  b1 -0.30
    13   A  b2    NA
    14  b1  b2    NA
    15  b2  b2    NA
    16   C  b2 -0.03
    17   L  b2 -0.15
    18 ctt  b2 -0.27
    19   A   C    NA
    20  b1   C    NA
    21  b2   C    NA
    22   C   C    NA
    23   L   C -0.13
    24 ctt   C  0.14
    25   A   L    NA
    26  b1   L    NA
    27  b2   L    NA
    28   C   L    NA
    29   L   L    NA
    30 ctt   L -0.28
    31   A ctt    NA
    32  b1 ctt    NA
    33  b2 ctt    NA
    34   C ctt    NA
    35   L ctt    NA
    36 ctt ctt    NA

What I'm looking for are the correlation values between neighboring variables, that is, between A-b1, b1-b2, b2-C, C-L and L-ctt (in column order). All other values, including the NAs, should be dropped. The expected result is:

        X1 X2 value
    2   b1  A  0.08
    9   b2 b1 -0.17
    16   C b2 -0.03
    23   L  C -0.13
    30 ctt  L -0.28

So the pairs follow the order A-b1-b2-C-L-ctt.

Is there an easy way to filter it?

+4

filter matrix r

shNIL Aug 2 '12 at 20:24

4 answers

Here is one way:

    n = rownames(corrmat)
    pair.table = data.frame(X1 = head(n, -1),
                            X2 = tail(n, -1),
                            value = diag(tail(corrmat, -1)))

Result:

    > pair.table
      X1  X2 value
    1  A  b1  0.08
    2 b1  b2 -0.17
    3 b2   C -0.03
    4  C   L -0.13
    5  L ctt -0.28

+4

David Robinson Aug 2 '12 at 20:36

These values sit just one step off the diagonal of the correlation matrix (the first subdiagonal). So all you have to do is shift the matrix so that they land on the diagonal: remove the first row and the last column, and then call diag().

    corrmat <- corrmat[-1, -ncol(corrmat)]
    data.frame(X1 = rownames(corrmat), X2 = colnames(corrmat), r = diag(corrmat))
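The snippet above overwrites corrmat. A minimal sketch of the same idea that leaves the original matrix untouched might look like this; the names vars, shifted and adjacent are made up for illustration, and the matrix is rebuilt by hand from the values printed in the question so the example runs on its own.

```r
# Rebuild the question's lower-triangular matrix from its printed values.
vars <- c("A", "b1", "b2", "C", "L", "ctt")
corrmat <- matrix(NA_real_, 6, 6, dimnames = list(vars, vars))
corrmat[lower.tri(corrmat)] <- c(0.08, 0.33, -0.70, 0.13, 0.15,
                                 -0.17, -0.27, -0.14, -0.30,
                                 -0.03, -0.15, -0.27,
                                 -0.13, 0.14,
                                 -0.28)

# Drop the first row and the last column into a copy, so the first
# subdiagonal of corrmat becomes the main diagonal of `shifted`.
shifted <- corrmat[-1, -ncol(corrmat)]

adjacent <- data.frame(X1 = rownames(shifted),
                       X2 = colnames(shifted),
                       r  = unname(diag(shifted)))
print(adjacent)
```

Because corrmat itself is not modified, the other answers' code can still be run against it afterwards.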

+2

John Aug 2 '12 at 20:51

My solution generates all pairwise combinations of the row/column names (with the combn function) and looks each pair up in a square distance matrix. SIF stands for simple interaction file.

    makeSIF <- function(x) {
      # args -
      #   x - m*m distance or correlation matrix
      # @returns data frame in SIF format
      #
      sif <- as.data.frame(t(combn(as.character(rownames(x)), 2)))
      #print(sif)
      weight <- apply(sif, 1, indexDMatFromLookup, x)
      sif2 <- data.frame(sif, weight)
      return(sif2)
    }

    indexDMatFromLookup <- function(lookup, x) {
      return(indexDMat(x, lookup[1], lookup[2]))
    }

    indexDMat <- function(x, i1, i2) {
      return(x[i1, i2])
    }
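combn() produces every pair of names, while the question only asks for consecutive ones. As a rough sketch (not part of the answer's code), the same name-based lookup can be restricted to adjacent pairs by indexing the matrix with a two-column character matrix. The names vars, pairs and sif_adjacent are hypothetical, and the matrix is rebuilt from the question's printout. Note that in the question's lower-triangular matrix the value for a pair is stored at [later variable, earlier variable]; addressing it as x[earlier, later], which is the order combn() yields, would hit the NA upper triangle.

```r
# Rebuild the question's lower-triangular matrix from its printed values.
vars <- c("A", "b1", "b2", "C", "L", "ctt")
corrmat <- matrix(NA_real_, 6, 6, dimnames = list(vars, vars))
corrmat[lower.tri(corrmat)] <- c(0.08, 0.33, -0.70, 0.13, 0.15,
                                 -0.17, -0.27, -0.14, -0.30,
                                 -0.03, -0.15, -0.27,
                                 -0.13, 0.14,
                                 -0.28)

# Consecutive pairs only: row name = later variable, column name = earlier one.
pairs <- cbind(X1 = vars[-1], X2 = vars[-length(vars)])

# A two-column character matrix used as an index is matched against the
# dimnames, one [row, col] lookup per row of `pairs`.
weight <- corrmat[pairs]

sif_adjacent <- data.frame(pairs, weight)
print(sif_adjacent)
```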

Having seen the other answers, this is probably much slower.

Edit: it is actually not so bad.

    system.time(replicate(1000, makeSIF(corrmat)))
       user  system elapsed
      0.976   0.900   0.975

    system.time(replicate(1000, data.frame(X1 = head(n, -1), X2 = tail(n, -1), value = diag(tail(corrmat, -1)))))
       user  system elapsed
      0.656   0.000   0.658

So it is only a fraction of a second slower than the diag()-based approach timed above.

+1

zzk Aug 2 '12 at 20:52

Gavin Simpson · Accepted Answer · 2012-08-02T20:44:43+0000

Here is one way, using the often overlooked row() and col() functions:

    > corrmat  ## my version as there was no set.seed
            A    b1    b2    C     L ctt
    A      NA    NA    NA   NA    NA  NA
    b1   0.03    NA    NA   NA    NA  NA
    b2  -0.41 -0.02    NA   NA    NA  NA
    C    0.11  0.61 -0.18   NA    NA  NA
    L   -0.28 -0.28  0.39 0.01    NA  NA
    ctt -0.21 -0.41 -0.55 0.34 -0.13  NA

    > corrmat[row(corrmat) == col(corrmat) + 1]
    [1]  0.03 -0.02 -0.18  0.01 -0.13

Note that we are indexing the matrix corrmat as if it were a vector here. The expression in the brackets is a logical matrix that marks the elements to return: those whose row index equals their column index plus 1, i.e. the first subdiagonal. Using - 1 instead would give you the superdiagonal (the one just above the diagonal).
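To see what the logical mask does, here is a small sketch on a 4 x 4 matrix of the integers 1 to 16. The matrix m and the names sub_diag and super_diag are illustrative only and do not come from the question.

```r
# matrix() fills column by column, so m[2, 1] is 2, m[1, 2] is 5, and so on.
m <- matrix(1:16, nrow = 4)

# row(m) and col(m) are matrices of the same shape as m holding each
# element's row and column index; comparing them gives a logical mask.
mask_below <- row(m) == col(m) + 1   # TRUE one step below the diagonal
mask_above <- row(m) == col(m) - 1   # TRUE one step above the diagonal

sub_diag   <- m[mask_below]   # 2 7 12
super_diag <- m[mask_above]   # 5 10 15

print(mask_below)
print(sub_diag)
print(super_diag)
```

The selected elements come back in column-major order, which for a subdiagonal is also the top-to-bottom order of the variable pairs.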

Combine all of this:

    out <- data.frame(X1 = rownames(corrmat)[-1],
                      X2 = head(colnames(corrmat), -1),
                      Value = corrmat[row(corrmat) == col(corrmat) + 1])

    > out
       X1 X2 Value
    1  b1  A  0.03
    2  b2 b1 -0.02
    3   C b2 -0.18
    4   L  C  0.01
    5 ctt  L -0.13