Pairwise graphical comparison of several distributions

Question


This is an edited version of a previous question.

We are given an m × n table of n observations (samples) over m variables (genes, etc.), and we want to study how the variables behave between each pair of observations, for example to find the two observations with the highest positive or negative correlation. For this purpose, I saw a large diagram in the Stadler et al. paper in Nature (2011):

[Figure: the diagram from Stadler et al. (2011)]

Here is a sample data set that can be used:

    m <- 1000
    samples <- data.frame(unif1 = runif(m), unif2 = runif(m, 1, 2),
                          norm1 = rnorm(m), norm2 = rnorm(m, 1),
                          norm3 = rnorm(m, 0, 5))
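To make the stated goal of finding the pair with the highest positive or negative correlation concrete, here is a minimal base R sketch. `strongest_pair` is a hypothetical helper name (not part of any package), the `demo` data frame is made up for illustration, and the code assumes every column is numeric.

```r
# Hypothetical helper: returns the pair of columns with the largest
# absolute Pearson correlation. Assumes every column of `df` is numeric.
strongest_pair <- function(df) {
  cors <- cor(df)
  cors[!upper.tri(cors)] <- NA        # keep each pair once, drop the diagonal
  k <- which.max(abs(cors))           # linear index of the strongest pair (NAs ignored)
  idx <- arrayInd(k, dim(cors))       # convert to (row, column)
  data.frame(var1 = rownames(cors)[idx[1, 1]],
             var2 = colnames(cors)[idx[1, 2]],
             cor  = cors[idx[1, 1], idx[1, 2]])
}

# Small made-up data set, used only for illustration
demo <- data.frame(a = c(1, 2, 3, 4, 5),
                   b = c(2, 4, 6, 8, 11),
                   c = c(5, 3, 4, 1, 2))
strongest_pair(demo)
```

The same call should work on the `samples` data frame above, although with independently generated columns all correlations will be close to zero.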

I have already tried gpairs(samples) from the gpairs package, which produces the plot below. It is a good start, but there is no option to put the correlation coefficients in the upper-right section, nor the density plots in the lower-left corner:

[Figure: output of gpairs(samples)]

Then I used ggpairs(samples, lower=list(continuous="density")) from the GGally package (thanks to @LucianoSelzer for the comment below). Now we have correlations in the upper corner and densities in the lower corner, but the bar plots on the diagonal are still missing, and the density plots are not drawn as heat maps.

[Figure: output of ggpairs(samples, lower=list(continuous="density"))]

Any ideas on how to get closer to the desired image (the first one)?

+7

r ggally heatmap density-plot

Ali Mar 19 '13 at 17:51


1 answer

Jouni Helske · Accepted Answer · 2013-03-19T19:59:19+0000

You can try to combine several different plotting methods and merge the results. Here is an example that can be tweaked as needed:

    library(MASS)  # provides kde2d()

    cors <- round(cor(samples), 2)  # correlations

    # Make the layout for the plot: panels 1-5 on the diagonal (histograms),
    # 6-15 above it (correlations), 16-25 below it (heatmaps).
    # upper.tri()/lower.tri() number the cells column by column.
    laymat <- diag(1:5)
    laymat[upper.tri(laymat)] <- 6:15
    laymat[lower.tri(laymat)] <- 16:25
    layout(laymat)  # define layout using laymat

    par(mar = c(2, 2, 2, 2))  # define margins etc.

    # Draw histograms; tweak the arguments of hist() to make nicer figures
    for (i in 1:5)
      hist(samples[, i], main = names(samples)[i])

    # Write correlations to the upper triangle; again, tweak accordingly.
    # Loop column by column so that each value lands in its own (i, j) cell.
    for (j in 2:5)
      for (i in 1:(j - 1)) {
        plot(-1:1, -1:1, type = "n", xlab = "", ylab = "", xaxt = "n", yaxt = "n")
        text(x = 0, y = 0, labels = paste(cors[i, j]), cex = 2)
      }

    # Plot heatmaps in the lower triangle: kde2d() estimates the 2D density,
    # image() draws it. Loop column by column here as well.
    for (j in 1:4)
      for (i in (j + 1):5) {
        k <- kde2d(samples[, i], samples[, j])
        image(k, col = heat.colors(1000))
      }

edit: Fixed indexing in the last loop.
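
The layout in the example is hard-coded for five variables. As a rough sketch of how the same panel numbering could be derived for any number of columns, the helpers below build the layout matrix and list the cells in drawing order. `pair_layout` and `panel_cells` are hypothetical names introduced here, not part of base R or any package. The sketch relies on the fact that assigning through `upper.tri()` or `lower.tri()` fills cells column by column, which is what fixes the order in which the off-diagonal panels have to be drawn.

```r
# Hypothetical helper: layout matrix for p variables.
# Diagonal cells get 1..p, the upper triangle the next p*(p-1)/2 numbers,
# and the lower triangle the remaining ones (both filled column by column).
pair_layout <- function(p) {
  n_off <- p * (p - 1) / 2
  laymat <- diag(seq_len(p))
  laymat[upper.tri(laymat)] <- p + seq_len(n_off)
  laymat[lower.tri(laymat)] <- p + n_off + seq_len(n_off)
  laymat
}

# Hypothetical helper: (row, column) of every panel, in drawing order,
# i.e. row k of the result is the cell that receives the k-th plot.
panel_cells <- function(laymat) {
  arrayInd(order(laymat), dim(laymat))
}

lay <- pair_layout(3)
cells <- panel_cells(lay)
cells  # rows 1-3: diagonal, rows 4-6: upper triangle, rows 7-9: lower triangle
```

With these, `layout(pair_layout(ncol(samples)))` would set up the grid, and a single loop over the rows of `panel_cells()` could choose between `hist()`, the correlation text and `image()` depending on whether the row index equals, is smaller than, or is larger than the column index.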
