Adding labels to curves in glmnet graphics in R

I use the glmnet package to produce the following plot from the mtcars dataset (regressing mpg on the other variables):

    library(glmnet)
    fit = glmnet(as.matrix(mtcars[-1]), mtcars[,1])
    plot(fit, xvar='lambda')

[Image: coefficient paths produced by the code above]

How can I add the variable name to each curve, either at the start of the curve or at the point where it is farthest from the x axis (its largest absolute y value)? I can add a legend in the usual way, but I have not managed to put a label on each curve or at its start. Thank you for your help.

(PS: If you find this question interesting / important, please vote for it;)

+7
3 answers

Since the labels are hardcoded, it might be easier to write a quick function. This is just a quick attempt, so you can change it to be more thorough. Note also that the lasso is usually applied with many variables, so many labels will overlap (as you can already see in your small example).

    lbs_fun <- function(fit, ...) {
        L <- length(fit$lambda)
        x <- log(fit$lambda[L])
        y <- fit$beta[, L]
        labs <- names(y)
        text(x, y, labels=labs, ...)
    }

    # plot
    plot(fit, xvar="lambda")

    # label
    lbs_fun(fit)

[Image: the same plot with a variable name at the small-lambda end of each curve]
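
The question also asks about labelling each curve where it is farthest from the x axis. A minimal sketch of that variant follows. `label_positions` is a hypothetical helper (not part of glmnet), and it assumes the plot was drawn with xvar="lambda" so that the x axis is log(lambda), as in the function above.

```r
# Hypothetical helper: for each variable, find the point on its path
# where the absolute coefficient is largest.
# beta:   coefficient matrix (variables in rows, one column per lambda)
# lambda: vector of lambda values, one per column of beta
label_positions <- function(beta, lambda) {
  beta <- as.matrix(beta)                       # also handles sparse fit$beta
  idx <- apply(abs(beta), 1, which.max)         # column of max |coef| per row
  data.frame(
    variable = rownames(beta),
    x = log(lambda[idx]),                       # assumes a log(lambda) x axis
    y = beta[cbind(seq_len(nrow(beta)), idx)],
    stringsAsFactors = FALSE
  )
}

# Usage on the question's data (runs only if glmnet is installed)
if (requireNamespace("glmnet", quietly = TRUE)) {
  library(glmnet)
  fit <- glmnet(as.matrix(mtcars[-1]), mtcars[, 1])
  plot(fit, xvar = "lambda")
  lab <- label_positions(fit$beta, fit$lambda)
  text(lab$x, lab$y, labels = lab$variable, cex = 0.7, pos = 3)
}
```

Curves that peak at the same lambda will still overlap, so arguments such as cex or pos may need adjusting by hand.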

+3

An alternative is the plot_glmnet function in the plotmo package. It automatically positions variable names and has several other bells and whistles. For example, the following code

    library(glmnet)
    mod <- glmnet(as.matrix(mtcars[-1]), mtcars[,1])
    library(plotmo) # for plot_glmnet
    plot_glmnet(mod)

gives

[Image: plot_glmnet output with a variable name next to each curve]

The variable names are spread out to prevent overplotting, but we can still see which curve belongs to which variable. Further examples can be found in chapter 6 of the plot vignette included in the plotmo package.

+3

The following is a modification of the best answer. It connects each label to its curve with a line segment instead of placing the text directly on the curve. This is especially useful when there are many variables and you want to label only those whose coefficients are nonzero (here, at lambda.min):

    # note: the argument 'lra' is a cv.glmnet object
    lbs_fun <- function(lra, ...) {
        fit <- lra$glmnet.fit
        L <- which(fit$lambda == lra$lambda.min)
        # nonzero coefficients at lambda.min, sorted so the segments do not cross
        ystart <- sort(fit$beta[abs(fit$beta[, L]) > 0, L])
        labs <- names(ystart)
        # coefficient range at the smallest lambda (last column, rather than a hardcoded 100)
        r <- range(fit$beta[, ncol(fit$beta)])
        yfin <- seq(r[1], r[2], length.out = length(ystart))
        xstart <- log(lra$lambda.min)
        xfin <- xstart + 1
        text(xfin + 0.3, yfin, labels = labs, ...)
        segments(xstart, ystart, xfin, yfin)
    }

    plot(lra$glmnet.fit, label = FALSE, xvar = "lambda", xlim = c(-5.2, 0), lwd = 2) # xlim and lwd are optional
    lbs_fun(lra) # add the labels and segments to the plot

0
