Is there a way to create a "star" plot using ggplot?

I am trying to (partially) reproduce the cluster plot available via s.class(...) in the ade4 package using ggplot, but this question is actually much more general.

NB: Another question refers to "star plots", but it actually discusses only spider plots.

    library(ade4)

    df     <- mtcars[, c(1, 3, 4, 5, 6, 7)]
    pca    <- prcomp(df, scale. = TRUE, retx = TRUE)
    scores <- data.frame(pca$x)

    # k-means starts from random centers, so cluster labels can differ between runs
    km      <- kmeans(df, centers = 3)
    plot.df <- cbind(scores$PC1, scores$PC2)
    s.class(plot.df, factor(km$cluster))

The key feature I'm looking for is the "star", i.e. a set of lines radiating from a common point (here, the cluster centroid) to a number of other points (here, the points in the cluster).

Is there a way to do this with the ggplot package? If not directly through ggplot, does anyone know of an add-on that works? For example, there are several versions of stat_ellipse(...) that are not part of the ggplot package.

2 answers

The difficulty here is creating the data, not the plot itself. You need to go through the package code and extract the parts that are useful to you. This should be a good start:

    dfxy <- plot.df
    df   <- data.frame(dfxy)
    x    <- df[, 1]
    y    <- df[, 2]
    fac  <- factor(km$cluster)

    # indicator matrix: one column per cluster, 1 where a point belongs to it
    f1 <- function(cl) {
      n  <- length(cl)
      cl <- as.factor(cl)
      x  <- matrix(0, n, length(levels(cl)))
      x[(1:n) + n * (unclass(cl) - 1)] <- 1
      dimnames(x) <- list(names(cl), levels(cl))
      data.frame(x)
    }

    # per-cluster weights, normalised so that each column sums to 1
    wt       <- rep(1, length(fac))
    dfdistri <- f1(fac) * wt
    w1       <- unlist(lapply(dfdistri, sum))
    dfdistri <- t(t(dfdistri) / w1)

    ## create a data.frame
    # cstar scales the segment length (1 ends each segment exactly at its point)
    cstar <- 2
    ll <- lapply(seq_len(ncol(dfdistri)), function(i) {
      z1 <- dfdistri[, i]
      z  <- z1[z1 > 0]
      x  <- x[z1 > 0]
      y  <- y[z1 > 0]
      z  <- z / sum(z)
      x1 <- sum(x * z)   # weighted centroid of cluster i
      y1 <- sum(y * z)
      hx <- cstar * (x - x1)
      hy <- cstar * (y - y1)
      data.frame(x = x1, y = y1, xend = x1 + hx, yend = y1 + hy, center = factor(i))
    })
    dat <- do.call(rbind, ll)

    library(ggplot2)
    ggplot(dat, aes(x = x, y = y)) +
      geom_point(aes(shape = center)) +
      geom_segment(aes(yend = yend, xend = xend, color = center, group = center))
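With unit weights (wt all equal to 1), the weighted centroid computed above is just the per-cluster mean, so the same segment table can be built in a few lines of base R. The following is only a sketch on made-up toy values standing in for plot.df and km$cluster; cstar is set to 1 here so that each segment ends at its point.

```r
# Toy data (hypothetical values) standing in for plot.df and km$cluster
x   <- c(1, 2, 3, 10, 11, 12)
y   <- c(1, 3, 2, 10, 12, 11)
fac <- factor(c(1, 1, 1, 2, 2, 2))

# cstar = 1 ends each segment at its point; larger values stretch the segments
cstar <- 1

# ave() returns, for every point, the mean of its own group (the centroid)
x1 <- ave(x, fac)
y1 <- ave(y, fac)

# one row per point: segment from the centroid (x, y) to (xend, yend)
dat <- data.frame(
  x      = x1,
  y      = y1,
  xend   = x1 + cstar * (x - x1),
  yend   = y1 + cstar * (y - y1),
  center = fac
)
```

The resulting dat has the same columns as the data frame built above and can be passed to the same ggplot call.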

This answer is based on @agstudy's answer and suggestions made in @Henrik's comment. Posting because it is shorter and more directly applicable to the issue.

The bottom line is this: star plots are easy to create with ggplot using geom_segment(...). Using df, pca, scores and km from the question:

    library(ggplot2)

    # build ggplot dataframe with points (x,y) and corresponding groups (cluster)
    gg <- data.frame(cluster = factor(km$cluster), x = scores$PC1, y = scores$PC2)

    # calculate group centroid locations
    centroids <- aggregate(cbind(x, y) ~ cluster, data = gg, mean)

    # merge centroid locations into ggplot dataframe
    gg <- merge(gg, centroids, by = "cluster", suffixes = c("", ".centroid"))

    # generate star plot...
    ggplot(gg) +
      geom_point(aes(x = x, y = y, color = cluster), size = 3) +
      geom_point(data = centroids, aes(x = x, y = y, color = cluster), size = 4) +
      geom_segment(aes(x = x.centroid, y = y.centroid, xend = x, yend = y, color = cluster))

The result matches the one obtained using s.class(...).
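If you need this more than once, the steps can be wrapped in a small helper. The sketch below is one possible way to do it, not an established API: star_plot and its arguments are hypothetical names, and the optional ellipse layer assumes a ggplot2 version that ships stat_ellipse(). It uses type = "norm" so that the ellipse is based on an ordinary covariance estimate.

```r
library(ggplot2)

# Hypothetical helper: star plot from two coordinate vectors and a grouping vector
star_plot <- function(x, y, group, ellipse = FALSE) {
  gg <- data.frame(x = x, y = y, group = factor(group))

  # per-group centroids, repeated for every point of the group
  gg$x.centroid <- ave(gg$x, gg$group)
  gg$y.centroid <- ave(gg$y, gg$group)

  p <- ggplot(gg, aes(x = x, y = y, color = group)) +
    geom_segment(aes(x = x.centroid, y = y.centroid, xend = x, yend = y)) +
    geom_point(size = 3)

  # optional normal-theory ellipse per group
  if (ellipse) p <- p + stat_ellipse(type = "norm")
  p
}

# Usage with the data from the question
set.seed(1)  # k-means uses random starting centers
df     <- mtcars[, c(1, 3, 4, 5, 6, 7)]
scores <- data.frame(prcomp(df, scale. = TRUE)$x)
km     <- kmeans(df, centers = 3)

p <- star_plot(scores$PC1, scores$PC2, km$cluster, ellipse = TRUE)
print(p)
```

Mapping color once in the top-level aes() keeps segments, points and ellipses in matching colors without repeating the mapping in every layer.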
