I have data that contain some NA values. What I want to do is perform clustering without deleting the rows where an NA is present.
I understand that the Gower distance measure in daisy allows for this situation. But why is my code below not working? I welcome alternatives other than daisy.
# plot heat map with dendrogram together.
library("gplots")
library("cluster")
I got an error message:
Error in which(is.na) : argument to 'which' is not logical
Calls: distfunc.g -> daisy
In addition: Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In data.matrix(x) : NAs introduced by coercion
3: In daisy(x, metric = "gower") :
  binary variable(s) 8, 9 treated as interval scaled
Execution halted
Ultimately, I would like to perform hierarchical clustering on data that contain NA values.
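For reference, the intended workflow can be sketched on a small made-up data frame. This is only a minimal sketch: the `toy` data and its column names are hypothetical and are not the data from the question. It assumes every pair of rows shares at least one observed variable, since otherwise the Gower dissimilarity for that pair is itself NA.

```r
library(cluster)

# Hypothetical toy data (not the data from the question):
# numeric columns with a few missing values
toy <- data.frame(
  a = c(1.0, 2.0, NA, 4.0, 5.0),
  b = c(2.1, NA, 3.3, 4.8, 5.0),
  c = c(10, 20, 30, NA, 50)
)

# Gower dissimilarity: for each pair of rows, only the variables
# observed in both rows contribute to the distance
d <- daisy(toy, metric = "gower")

# Check that no pairwise dissimilarity is missing before clustering
stopifnot(!anyNA(d))

# Hierarchical clustering on the dissimilarity object
fit <- hclust(d, method = "complete")
print(fit$order)
```

If `anyNA(d)` is TRUE, `hclust` will stop with an NA/NaN/Inf error, so the check is worth running before the clustering step.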
Update
Conversion using as.numeric works with the above example. But why did this code fail when reading from a text file?
library("gplots")
library("cluster")

# This time read from file
mtcars <- read.table("http://dpaste.com/1496666/plain/",na.strings="NA",sep="\t")

# Following suggestion convert to numeric
mydata <- apply( mtcars, 2, as.numeric )

hclustfunc <- function(x) hclust(x, method="complete")
#distfunc <- function(x) dist(x,method="euclidean")

# Try using daisy GOWER function
distfunc <- function(x) daisy(x,metric="gower")

d <- distfunc(mydata)
fit <- hclustfunc(d)

heatmap.2(as.matrix(mydata),dendrogram="row",trace="none", margin=c(8,9), hclust=hclustfunc,distfun=distfunc);
The error I am getting is this:
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Error in hclust(x, method = "complete") :
  NA/NaN/Inf in foreign function call (arg 11)
Calls: hclustfunc -> hclust
Execution halted
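One way to investigate warnings like these is to check whether any column became entirely NA after the `as.numeric` conversion, for example a column of text labels. Whether that is what happens with the file above is an assumption and has not been verified. The sketch below uses a hypothetical stand-in data frame (`raw`, with made-up columns `id`, `x`, `y`, `z`), not the actual file.

```r
library(cluster)

# Hypothetical stand-in for data read from a file: one text column
# plus numeric columns that contain some NA values
raw <- data.frame(
  id = c("g1", "g2", "g3", "g4"),
  x  = c(1.5, NA, 3.0, 4.2),
  y  = c(0.3, 0.8, NA, 1.1),
  z  = c(10, 12, 9, 15),
  stringsAsFactors = FALSE
)

# Same conversion as in the question; the text column turns into NA only
num <- suppressWarnings(apply(raw, 2, as.numeric))

# Flag columns that have no usable (non-missing) values at all
all_na <- colSums(!is.na(num)) == 0
print(names(which(all_na)))

# Drop those columns, then compute the Gower dissimilarity
clean <- num[, !all_na, drop = FALSE]
d <- daisy(clean, metric = "gower")

# hclust cannot handle NA/NaN/Inf in the dissimilarities
stopifnot(!anyNA(d))
fit <- hclust(d, method = "complete")
```

If the all-NA check finds nothing, inspecting `str(mtcars)` right after `read.table` (for instance to see whether a header row was read as data) would be a reasonable next step.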
r cluster-analysis bioconductor
neversaint