Change the leaf colors in plot.dendrogram as with plot.phylo of the ape package

I am trying to plot the result of an agglomerative clustering (UPGMA with agnes) in the same style as a tree plotted with the ape package. A simple example is shown in the figure below.

Figure 1. A simple example of the final output required

The key problem is that I want to color the leaves of the dendrogram based on a pattern in the leaf labels. I tried two approaches: either I used hc2Newick, or I used the code by Joris Meys, as suggested in an answer to "Change Dendrogram leaves". Neither gave a satisfactory result. Perhaps I do not quite understand how dendrograms work. An ASCII save of the object abundance.agnes.ave (the result of the agnes run) can be found at https://www.dropbox.com/s/gke9qnvwptltkky/abundance.agnes.ave .

When I use the first option (with hc2Newick from the Bioconductor package ctc), I get the following figure with this code:

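    library(cluster)  # agnes, as.hclust method for agnes objects
    library(ctc)      # hc2Newick (Bioconductor)
    library(ape)      # read.tree, ladderize, plot.phylo

    # Export the clustering as Newick and read it back as a phylo object
    write(hc2Newick(as.hclust(abundance.agnes.ave)), file = "all_samples_euclidean.tre")
    eucltree <- read.tree(file = "all_samples_euclidean.tre")
    eucltree.laz <- ladderize(eucltree, FALSE)

    # Green tips by default, red where the label contains "II"
    tiplabs <- eucltree$tip.label
    numbertiplabs <- length(tiplabs)
    colourtips <- rep("green", numbertiplabs)
    colourtips[grep("II", tiplabs)] <- "red"

    plot(eucltree.laz, tip.color = colourtips, adj = 1, cex = 0.6, use.edge.length = FALSE)
    add.scale.bar()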

Using plot.phylo

This is obviously not ideal: the "alignment" of the plot is not what I wanted. I believe this is related to how the branch lengths are calculated, but I have no idea how to solve it. Compare this with the result of the colLab function below, which is much closer to the dendrogram style I would like to report. Also, using use.edge.length=T in the code above gives a clustering that is not properly aligned:

Plot.phylo with branch length

The second approach, using the colLab function by Joris Meys with the following code, yields the figure below:

    clusDendro <- as.dendrogram(as.hclust(abundance.agnes.ave))
    labelColors <- c("red", "green")
    clusMember <- rep(1, length(rownames(abundance.x)))
    clusMember[grep("II", rownames(abundance.x))] <- 2
    names(clusMember) <- rownames(abundance.x)

    colLab <- function(n) {
      if (is.leaf(n)) {
        a <- attributes(n)
        # clusMember - a vector designating leaf grouping
        # labelColors - a vector of colors for the above grouping
        labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
        attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
      }
      n
    }

    clusDendro <- dendrapply(clusDendro, colLab)
    plot(clusDendro, horiz = TRUE, axes = FALSE)

Using colLab

This plot is close to what I want, but I don’t know why open circles appear on the leaves or how to remove them.

Any help is greatly appreciated.

Yours faithfully,

FM

2 answers

I wrote that code quite some time ago, and it seems that something in the mechanism has changed a bit since then.

The plot.dendrogram function that I used has a nodePar argument. Its behavior has changed since the last time I used the function: although it is normally used for the inner nodes, it apparently affects the outer nodes (the leaves) as well. According to the help files, the default value for pch is now 1:2.

Therefore, you need to set pch=NA explicitly in the attributes that you add to the outer nodes in the colLab function. Try adapting it like this:

    colLab <- function(n) {
      if (is.leaf(n)) {
        a <- attributes(n)
        # clusMember - a vector designating leaf grouping
        # labelColors - a vector of colors for the above grouping
        labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
        attr(n, "nodePar") <-
          if (is.list(a$nodePar)) c(a$nodePar, lab.col = labCol, pch = NA)
          else list(lab.col = labCol, pch = NA)
      }
      n
    }

On my machine, this solves the problem.

Alternatively, you can look at the use.edge.length argument of the plot.phylo function in the ape package. You set it to FALSE, but from your explanation I believe you want it set to TRUE, which is the default.

EDIT: to make the function more general, it might be a good idea to pass labelColors and clusMember as arguments to the function. My quick-and-dirty solution is not the best example of clean code ...
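One way to do that, as a minimal sketch rather than a definitive implementation, is a small factory that takes the two vectors and returns the function handed to dendrapply. The name makeColLab and the toy matrix x (standing in for abundance.x) are hypothetical and only serve the illustration.

```r
# Hypothetical helper: returns a function suitable for dendrapply().
# clusMember  - named vector mapping each leaf label to a group index
# labelColors - vector of colors, indexed by group
makeColLab <- function(clusMember, labelColors) {
  function(n) {
    if (is.leaf(n)) {
      a <- attributes(n)
      labCol <- labelColors[clusMember[[a$label]]]
      # Combine lists so that the color (character) and pch (NA) keep their types
      attr(n, "nodePar") <- c(as.list(a$nodePar), list(lab.col = labCol, pch = NA))
    }
    n
  }
}

# Toy data standing in for abundance.x (assumption, not the original data)
set.seed(1)
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(c("I_a", "I_b", "I_c", "II_a", "II_b", "II_c"), NULL))
dend <- as.dendrogram(hclust(dist(x), method = "average"))

# Group 1 by default, group 2 where the label contains "II"
clusMember <- rep(1, nrow(x))
clusMember[grep("II", rownames(x))] <- 2
names(clusMember) <- rownames(x)

dend <- dendrapply(dend, makeColLab(clusMember, c("red", "green")))
plot(dend, horiz = TRUE, axes = FALSE)
```

Because the vectors are captured when makeColLab is called, the same helper can be reused for other dendrograms without relying on global variables.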

Also, forget what I said about using the edge length. The ape package interprets the tree as a real phylogenetic tree, and setting use.edge.length to TRUE makes it treat the edge lengths as evolutionary time. Hence the "strange" layout of the dendrogram.

Also note that if the nodes of the tree do not have a nodePar attribute, adding extra parameters with the c() function has an undesirable effect: if you add, for example, lab.cex=0.6, c() creates a vector instead of a list and, because the parameters include a character value, converts the value of lab.cex to character as well. In this case the character value is the name of the color, and this explains the error you mention in the comment.
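The coercion can be reproduced outside of any dendrogram. The following sketch uses hypothetical variable names (asVector, asList, oldPar, newPar) and shows the difference between combining atomic values and combining lists:

```r
# c() on atomic values coerces everything to the most general type (character here)
asVector <- c(lab.col = "red", lab.cex = 0.6, pch = NA)
str(asVector)   # a named character vector; lab.cex has become a string

# list() keeps each element's own type
asList <- list(lab.col = "red", lab.cex = 0.6, pch = NA)
str(asList)

# Appending to a nodePar that may be absent: make both sides lists first
oldPar <- NULL  # what attributes(n)$nodePar returns when the attribute is not set
newPar <- c(as.list(oldPar), list(lab.col = "red", lab.cex = 0.6))
str(newPar)
```

Wrapping the existing value in as.list() means the same line works whether nodePar is missing or already a list.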


This functionality is now available in a new package called "dendextend", built specifically for this kind of thing.

You can see many examples in the presentations and vignettes of the package in the "Use" section at the following URL: https://github.com/talgalili/dendextend
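As a rough sketch of how this might look for the question above, assuming dendextend is installed and using a toy matrix x in place of the original data, the leaf colors can be assigned with the package's labels_colors replacement function:

```r
library(dendextend)  # assumed to be installed from CRAN

# Toy data standing in for the original abundance matrix (assumption)
set.seed(1)
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(c("I_a", "I_b", "I_c", "II_a", "II_b", "II_c"), NULL))
dend <- as.dendrogram(hclust(dist(x), method = "average"))

# Colors are given in the order of the labels in the dendrogram, not of the data rows
leafCols <- ifelse(grepl("II", labels(dend)), "green", "red")
labels_colors(dend) <- leafCols

plot(dend, horiz = TRUE, axes = FALSE)
```

Deriving the colors from labels(dend) rather than from rownames(x) avoids mismatches, since clustering reorders the leaves.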

An answer to an almost identical question was given in the following SO question:

fooobar.com/questions/329656 / ...
