Heatmaps in R using the ggplot function - how to cluster rows?

I am currently generating heatmaps in R using the ggplot function. In the code below, I first read the data into a data frame, remove any rows with a duplicated timestamp, convert the timestamp field to a factor, melt the data frame (by "timestamp"), scale each variable between 0 and 1, and then plot the heatmap.

In the resulting heatmap, time is displayed along the x axis, and each iostat-sda variable (see the sample data below) is displayed along the y axis. Note: if you want to try the R code, you can paste the sample data below into a file called iostat-sda.csv.

However, I really need to be able to cluster the rows in this heatmap. Does anyone know how this can be achieved with the ggplot function?

Any help would be greatly appreciated!

############################## The code
library(ggplot2)
library(reshape2)  # melt()
library(plyr)      # ddply() and .()
library(scales)    # rescale()

fileToAnalyse <- read.csv(file = "iostat-sda.csv", header = TRUE, sep = ",")
# Drop rows with a duplicated timestamp
fileToAnalyse <- subset(fileToAnalyse, !duplicated(timestamp))
# Treat the timestamp as a discrete label
fileToAnalyse[, 1] <- factor(fileToAnalyse[, 1])
fileToAnalyse.m <- melt(fileToAnalyse, id = 1)
# Scale each variable between 0 and 1
fileToAnalyse.s <- ddply(fileToAnalyse.m, .(variable), transform, rescale = rescale(value))

base_size <- 9
# Note: opts() and theme_blank() belong to older ggplot2 releases;
# current releases use theme() and element_blank() instead.
ggplot(fileToAnalyse.s, aes(timestamp, variable)) +
  geom_tile(aes(fill = rescale), colour = "black") +
  scale_fill_gradient(low = "black", high = "white") +
  theme_grey(base_size = base_size) +
  labs(x = "Time", y = "") +
  opts(title = paste("Heatmap"), legend.position = "right",
       axis.text.x = theme_blank(), axis.ticks = theme_blank()) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_discrete(expand = c(0, 0))

########################## Sample data from iostat-sda.csv
timestamp,DSKRRQM,DSKWRQM,DSKR,DSKW,DSKRMB,DSKWMB,DSKARQS,DSKAQUS,DSKAWAIT,DSKSVCTM,DSKUtil
1319204905,0.33,0.98,10.35,2.37,0.72,0.02,120.00,0.01,0.40,0.31,0.39
1319204906,1.00,4841.00,682.00,489.00,60.09,40.68,176.23,2.91,2.42,0.50,59.00
1319204907,0.00,1600.00,293.00,192.00,32.64,13.89,196.45,5.48,10.76,2.04,99.00
1319204908,0.00,3309.00,1807.00,304.00,217.39,26.82,236.93,4.84,2.41,0.45,96.00
1319204909,0.00,5110.00,93.00,427.00,0.72,43.31,173.43,4.43,8.67,1.90,99.00
1319204910,0.00,6345.00,115.00,496.00,0.96,52.25,178.34,4.00,6.32,1.62,99.00
1319204911,0.00,6793.00,129.00,666.00,1.33,57.22,150.83,4.74,6.16,1.26,100.00
1319204912,0.00,6444.00,115.00,500.00,0.93,53.06,179.77,4.20,6.83,1.58,97.00
1319204913,0.00,1923.00,835.00,215.00,78.45,16.68,185.55,4.81,4.58,0.91,96.00
1319204914,0.00,0.00,788.00,0.00,83.51,0.00,217.04,0.45,0.57,0.25,20.00
1319204915,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
1319204916,0.00,4.00,2.00,4.00,0.01,0.04,17.67,0.00,0.00,0.00,0.00
1319204917,0.00,8.00,4.00,8.00,0.02,0.09,17.83,0.00,0.00,0.00,0.00
1319204918,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00,0.00
1319204919,0.00,2.00,113.00,4.00,11.96,0.03,209.93,0.06,0.51,0.43,5.00
1319204920,0.00,59.00,147.00,54.00,11.15,0.63,120.02,0.04,0.20,0.15,3.00
1319204921,1.00,19.00,57.00,18.00,4.68,0.20,133.47,0.07,0.93,0.67,5.00
1 answer

There is a nice package called NeatMap that makes it easy to create heatmaps with ggplot2. Its row clustering methods include multidimensional scaling, PCA, and hierarchical clustering. Beware:

  • Data passed to make.heatmap1 must be in wide format
  • Data should be a matrix, not a data frame
  • Assign row names to the wide-format matrix before plotting
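To make the three requirements above concrete, here is a minimal base-R sketch on toy data. The values and the names `long`, `wide_df` and `wide` are made up for illustration and are not taken from iostat-sda.csv; it only shows one way of getting from long format to a numeric matrix with row names.

```r
# Toy long-format data (hypothetical values): one row per timestamp/variable pair.
long <- data.frame(
  timestamp = rep(c("1319204905", "1319204906", "1319204907"), times = 2),
  variable  = rep(c("DSKR", "DSKW"), each = 3),
  value     = c(0.0, 1.0, 0.4, 0.0, 1.0, 0.5)
)

# Long -> wide: one row per timestamp, one column per variable.
wide_df <- reshape(long, idvar = "timestamp", timevar = "variable",
                   direction = "wide")
# reshape() prefixes the new columns with "value."; strip that prefix.
names(wide_df) <- sub("^value\\.", "", names(wide_df))

# Data frame -> numeric matrix, dropping the timestamp column ...
wide <- as.matrix(wide_df[, -1])
# ... and keeping the timestamps as row names instead.
rownames(wide) <- as.character(wide_df$timestamp)

print(wide)
```

The result is a plain numeric matrix whose row names carry the timestamps, which is the shape the list above asks for.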

I modified the code a bit to avoid giving variables the same name as existing functions (i.e. rescale):

library(NeatMap)

# Scale each variable between 0 and 1
fileToAnalyse.s <- ddply(fileToAnalyse.m, .(variable), transform, rescale.x = rescale(value))
# Back to wide format: one row per timestamp, one column per variable
# (the argument is value.var in current reshape2; older releases spelled it value_var)
fileToAnalyse.w <- dcast(fileToAnalyse.s, timestamp ~ variable, value.var = "rescale.x")
rownames(fileToAnalyse.w) <- as.character(fileToAnalyse.w[, 1])

ggheatmap <- make.heatmap1(as.matrix(fileToAnalyse.w[, -1]),
                           row.method = "complete.linkage",
                           row.metric = "euclidean",
                           column.cluster.method = "none",
                           row.labels = rownames(fileToAnalyse.w)) +
  scale_fill_gradient(low = "black", high = "white") +
  labs(x = "Time", y = "") +
  opts(title = paste("Heatmap"))
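If NeatMap is not available, a similar row ordering can be sketched with base R's hclust() and plain ggplot2. This is an alternative technique, not what make.heatmap1 does internally: it clusters the rows hierarchically (Euclidean distance, complete linkage) and then encodes the resulting order in the factor levels of the y variable. The toy matrix and the names `wide`, `row_clust`, `row_order` and `long` are assumptions made for illustration.

```r
library(ggplot2)

# Toy data (hypothetical): rows are variables, columns are time points,
# values already scaled to [0, 1]. A/C and B/D have similar profiles.
wide <- rbind(
  A = c(0, 0.0, 0, 1, 1.0, 1),
  B = c(1, 1.0, 1, 0, 0.0, 0),
  C = c(0, 0.1, 0, 1, 0.9, 1),
  D = c(1, 0.9, 1, 0, 0.1, 0)
)
colnames(wide) <- paste0("t", 1:6)

# Hierarchical clustering of the rows: Euclidean distance, complete linkage.
row_clust <- hclust(dist(wide, method = "euclidean"), method = "complete")
# Row names in dendrogram (leaf) order.
row_order <- rownames(wide)[row_clust$order]

# Long format for ggplot2; as.vector() unrolls the matrix column by column,
# and the factor levels of `variable` carry the clustered row order.
long <- data.frame(
  variable  = factor(rep(rownames(wide), times = ncol(wide)), levels = row_order),
  timestamp = factor(rep(colnames(wide), each = nrow(wide)), levels = colnames(wide)),
  value     = as.vector(wide)
)

p <- ggplot(long, aes(timestamp, variable)) +
  geom_tile(aes(fill = value), colour = "black") +
  scale_fill_gradient(low = "black", high = "white") +
  labs(x = "Time", y = "", title = "Heatmap (rows ordered by hclust)")
print(p)
```

Because ggplot2 draws a discrete axis in factor-level order, rows with similar profiles end up next to each other. The same idea should carry over to the iostat data by clustering the transposed wide matrix, so that the iostat variables are the rows being ordered.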