R - how to make a PCA biplot more readable

I have a set of observations with 23 variables.

When I use prcomp and biplot to plot the results, I run into several problems:

  • the actual plot occupies only half the frame (x < 0), but the plot is centered on 0, so half the space is wasted

  • two variables clearly dominate the results, so all the other arrows are bunched together and I can't read them

ad 1. I tried to set xlim and/or ylim, but I'm obviously doing something wrong, because the plot is all messed up when I do

ad 2. Can I somehow make the labels of the arrows be placed further apart so that I can read them? Or maybe I could just plot the arrows without the two longest ones (a kind of zooming in)?

My pca plot

Addendum: is it possible for biplot to draw the labels in a different color than the arrows?

Also: isn't it problematic if the x and y axes are not proportional (the plot shows intervals of different lengths along x and y)? I think this would distort the angles between the arrows, since such a rescaling is not a similarity transformation. Is it possible to force biplot to maintain 1:1 proportions, or to draw the plot as a rectangle rather than a square?

r plot pca
1 answer

I think you can use xlim and ylim. Also, have a look at the expand argument in ?biplot. Unfortunately, you did not provide any data, so let's take some sample data:

 a <- princomp(USArrests) 

Below is the result of just calling biplot:

 biplot(a) 


And now you can “zoom in” to take a closer look at “Murder” and “Rape” with xlim and ylim, and also use the expand argument from ?biplot to stretch the arrows relative to the points:

 biplot(a, expand=10, xlim=c(-0.30, 0.0), ylim=c(-0.1, 0.1)) 


Note the different scaling on the top and right axes due to the expand factor.

Does this make your plot more readable?
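
The question also asked about plotting the arrows without the two longest ones. A minimal sketch of that idea, assuming it is acceptable to pass the raw scores and loadings straight to biplot (which dispatches to biplot.default for matrices, so the scaling differs from what biplot(a) shows); the names load2, len and keep are only illustrative:

```r
a <- princomp(USArrests)

# Loadings on the first two components as a plain matrix
load2 <- unclass(a$loadings)[, 1:2]

# Arrow length of each variable in the PC1/PC2 plane
len <- sqrt(rowSums(load2^2))

# Keep everything except the two longest arrows
keep <- names(sort(len))[seq_len(length(len) - 2)]

# Matrix inputs go through biplot.default: first the scores, then the loadings
biplot(a$scores[, 1:2], load2[keep, , drop = FALSE])
```

With the unscaled USArrests data this drops Assault and UrbanPop, which is why the remaining arrows become easier to tell apart.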

EDIT

You also asked if it is possible to have different colors for labels and arrows. biplot does not support this; what you can do is copy the code of stats:::biplot.default and then change it according to your needs (change the col argument in the calls to plot, axis and text).
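
As a rough alternative to editing biplot.default, a similar plot can be drawn by hand with base graphics, which gives separate control over the arrow and label colors and lets asp = 1 enforce a 1:1 aspect ratio. This is only a sketch: the scaling factor k is an arbitrary choice made here for readability and does not reproduce the scaling used by biplot.

```r
fit <- prcomp(USArrests, scale. = TRUE)
scores <- fit$x[, 1:2]
loads <- fit$rotation[, 1:2]

# Arbitrary factor so that the arrows span a range comparable to the scores
k <- 0.7 * min(apply(abs(scores), 2, max) / apply(abs(loads), 2, max))
arrow_tips <- loads * k

# asp = 1 keeps one unit on x the same length as one unit on y
plot(scores, type = "n", asp = 1)
text(scores, labels = rownames(scores), cex = 0.6, col = "grey40")

# Arrows in one color, variable labels in another
arrows(0, 0, arrow_tips[, 1], arrow_tips[, 2], length = 0.1, col = "red")
text(arrow_tips * 1.1, labels = rownames(arrow_tips), col = "blue")
```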

Alternatively, you can use ggplot2 for a biplot. The post here implements a simple biplot function. You can change the code as follows:

 library(ggplot2)

 PCbiplot <- function(PC, x = "PC1", y = "PC2",
                      colors = c("black", "black", "red", "red")) {
   # PC is a prcomp object.
   # colors: observation labels, vertical zero line, variable labels, arrows
   data <- data.frame(obsnames = row.names(PC$x), PC$x)
   # linewidth needs ggplot2 >= 3.4; use size in older versions
   plot <- ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
     geom_text(alpha = .4, size = 3, aes(label = obsnames), color = colors[1]) +
     geom_hline(yintercept = 0, linewidth = .2) +
     geom_vline(xintercept = 0, linewidth = .2, color = colors[2]) +
     labs(x = x, y = y)
   datapc <- data.frame(varnames = rownames(PC$rotation), PC$rotation)
   # Scale the loadings so the arrows span roughly the same range as the scores
   mult <- min(
     diff(range(data[, y])) / diff(range(datapc[, y])),
     diff(range(data[, x])) / diff(range(datapc[, x]))
   )
   datapc$v1 <- .7 * mult * datapc[[x]]
   datapc$v2 <- .7 * mult * datapc[[y]]
   plot +
     coord_equal() +
     geom_text(data = datapc, aes(x = v1, y = v2, label = varnames),
               size = 5, vjust = 1, color = colors[3]) +
     geom_segment(data = datapc, aes(x = 0, y = 0, xend = v1, yend = v2),
                  arrow = arrow(length = unit(0.2, "cm")),
                  alpha = 0.75, color = colors[4])
 }

Plotting goes as follows:

 fit <- prcomp(USArrests, scale. = TRUE)
 PCbiplot(fit, colors = c("black", "black", "red", "yellow"))


If you play around a bit with this function, I'm sure you can figure out how to set the values of xlim and ylim, etc.
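
One possible way to set xlim and ylim in a ggplot2 plot while keeping the 1:1 aspect ratio is coord_fixed, which accepts both limits and a ratio. The sketch below is self-contained and plots only the scores; the limits are arbitrary values picked for illustration, not recommendations.

```r
library(ggplot2)

fit <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(obsnames = rownames(fit$x), fit$x)

# coord_fixed zooms to the given window without dropping data
# and keeps one unit on x equal to one unit on y
p <- ggplot(scores, aes(x = PC1, y = PC2, label = obsnames)) +
  geom_text(size = 3) +
  coord_fixed(ratio = 1, xlim = c(-3, 0), ylim = c(-1.5, 1.5))
print(p)
```

Because the limits are set on the coordinate system rather than on the scales, points outside the window are only hidden, not removed from the data.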
