R: a way to manipulate a microarray-like data matrix when rows and columns have additional annotation tables

Data

I often work with microarray-like data. It consists of a numeric matrix plus annotations for its columns and rows. You can think of columns as individuals, of rows as genes, and of the matrix entries as some measurement for each gene-individual pair. Here is a smaller simulated version (real data can have millions of rows):

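p <- 500000   # number of rows (genes)
N <- 50       # number of columns (individuals)
# p x N matrix of measurements
mat <- matrix(rnorm(p*N), ncol=N)
# N x 10 character matrix: ten annotation variables (A..J) per individual
colData <- replicate(10, sample(letters[sample(26, 4)], N, replace=TRUE))
colnames(colData) <- toupper(letters[1:10])
# p x 3 data frame: one annotation row per gene
rowData <- data.frame("chromosome"=rep(c("chr1","chr2"), rep(p/2,2)),
                      "coordinates"=rep(1:(p/2), 2),
                      "someScore"=round(runif(p, max=10))
                     )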

Thus, each row of rowData annotates one gene, and each row of colData annotates one individual (think gender, age, etc.).

My approach so far is to combine all the information into one list:

theData <- list(mat=mat, rowInfo=rowData, colInfo=colData)
rm(mat, rowData, colData)

I then use custom functions to manipulate this data.
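To make "custom functions" concrete, here is a minimal sketch of what one such helper might look like. The name subsetData and its arguments are hypothetical (they are not from my actual code), and the example rebuilds a much smaller theData so it runs on its own. The only point is that the matrix and both annotation tables are indexed together, so they cannot drift out of sync.

```r
# Small stand-in for theData: same layout as above, far fewer rows.
set.seed(1)
p <- 1000
N <- 50
theData <- list(
  mat     = matrix(rnorm(p * N), ncol = N),
  rowInfo = data.frame(chromosome  = rep(c("chr1", "chr2"), each = p / 2),
                       coordinates = rep(1:(p / 2), 2),
                       someScore   = round(runif(p, max = 10))),
  colInfo = replicate(10, sample(letters[sample(26, 4)], N, replace = TRUE))
)
colnames(theData$colInfo) <- toupper(letters[1:10])

# Hypothetical helper: apply the same row/column selection to the matrix
# and to both annotation tables. `rows` and `cols` are logical or integer
# index vectors; the default TRUE keeps everything.
subsetData <- function(x, rows = TRUE, cols = TRUE) {
  list(mat     = x$mat[rows, cols, drop = FALSE],
       rowInfo = x$rowInfo[rows, , drop = FALSE],
       colInfo = x$colInfo[cols, , drop = FALSE])
}

# Keep chr1 genes with someScore above 5, and the individuals that share
# the first individual's value of annotation "A".
keepRows <- theData$rowInfo$chromosome == "chr1" & theData$rowInfo$someScore > 5
keepCols <- theData$colInfo[, "A"] == theData$colInfo[1, "A"]
sub <- subsetData(theData, keepRows, keepCols)
```

drop = FALSE matters here: without it, selecting a single row or column would silently turn the matrix or annotation table into a plain vector and break later calls that expect two dimensions.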

Use cases

:

  • ( / ) (.. ).
  • ( ) .
  • / ( rowMeans /colMeans )

, : :

  • , "colData".
  • cols , "rowData".

. , . .

, :

  • Bioconductor. - , bioconductor . .

  • tidyr/plyr. - "" . - - . .

  • dplyr. - , , , . .

  • data.table. - . , , .

  • . - (apply, tapply ..) - , . , .

split-apply-comb .

, , , .

, .


, , dplyr data.table , . , data.frame.

dplyr. data.table , . dplyr , .

" " " -". ( .)

