ggplot2: How to draw small Gaussian density curves along a regression line?

I want to illustrate the assumptions behind linear regression (and, later, other types of regression). How can I add small Gaussian densities (or any other type of density) along the regression line in my plot, as in this figure:

enter image description here

r plot regression ggplot2
1 answer

You can compute the empirical densities of the residuals for sections along the fitted line. Then it is just a matter of drawing the curves at positions of your choosing in each interval using geom_path. To add a theoretical distribution, generate densities over the range of the residuals in each section (here, using the normal density). For the normal densities below, the standard deviation is estimated separately for each section from its residuals, but you could instead choose a single standard deviation and use it for all sections.

    library(ggplot2)

    ## Sample data
    set.seed(0)
    x <- runif(100, 0, 50)
    dat <- data.frame(x = x, y = rnorm(100, 10 * x, 100))

    ## Breaks: where you want to compute densities
    breaks <- seq(0, max(dat$x), length.out = 5)
    dat$section <- cut(dat$x, breaks)

    ## Get the residuals
    dat$res <- residuals(lm(y ~ x, data = dat))

    ## Compute densities for each section, flip the axes, and add the section means
    ## Note: the densities need to be scaled in relation to the section size (2000 here)
    dens <- do.call(rbind, lapply(split(dat, dat$section), function(x) {
        d <- density(x$res, n = 50)
        res <- data.frame(x = max(x$x) - d$y * 2000, y = d$x + mean(x$y))
        res <- res[order(res$y), ]
        ## Get some data for normal lines as well
        xs <- seq(min(x$res), max(x$res), length.out = 50)
        res <- rbind(res, data.frame(y = xs + mean(x$y),
                                     x = max(x$x) - 2000 * dnorm(xs, 0, sd(x$res))))
        res$type <- rep(c("empirical", "normal"), each = 50)
        res
    }))
    ## Each section contributes 100 rows (50 empirical + 50 normal)
    dens$section <- rep(levels(dat$section), each = 100)

    ## Plot both empirical and theoretical
    ggplot(dat, aes(x, y)) +
        geom_point() +
        geom_smooth(method = "lm", fill = NA, lwd = 2) +
        geom_path(data = dens, aes(x, y, group = interaction(section, type), color = type), lwd = 1.1) +
        theme_bw() +
        geom_vline(xintercept = breaks, lty = 2)

enter image description here

Or, with only the Gaussian curves:

    ## Just normal
    ggplot(dat, aes(x, y)) +
        geom_point() +
        geom_smooth(method = "lm", fill = NA, lwd = 2) +
        geom_path(data = dens[dens$type == "normal", ], aes(x, y, group = section), color = "salmon", lwd = 1.1) +
        theme_bw() +
        geom_vline(xintercept = breaks, lty = 2)

enter image description here
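
The answer notes that a single standard deviation could be used for all sections instead of one per section. Below is a minimal sketch of that variant, not part of the original answer. It assumes the residual standard error of the fit (`sigma(fit)`) as the common value, which matches the constant-variance assumption of linear regression. The names `normal_curve`, `sd_all`, and `dens_common` are introduced here for illustration, and the curves span plus or minus three standard deviations rather than the observed residual range.

```r
library(ggplot2)

## Same simulated data and sections as in the answer
set.seed(0)
x <- runif(100, 0, 50)
dat <- data.frame(x = x, y = rnorm(100, 10 * x, 100))
breaks <- seq(0, max(dat$x), length.out = 5)
dat$section <- cut(dat$x, breaks)

fit <- lm(y ~ x, data = dat)

## Assumption: one standard deviation shared by every section,
## taken here as the residual standard error of the fit
sd_all <- sigma(fit)

## Hypothetical helper: one sideways normal curve per section,
## anchored at the section's largest x and centered on its mean y
normal_curve <- function(sec, s, scale = 2000, n = 50) {
    xs <- seq(-3 * s, 3 * s, length.out = n)
    data.frame(x = max(sec$x) - scale * dnorm(xs, 0, s),
               y = xs + mean(sec$y),
               section = as.character(sec$section[1]))
}
dens_common <- do.call(rbind, lapply(split(dat, dat$section), normal_curve, s = sd_all))

p <- ggplot(dat, aes(x, y)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, fill = NA, lwd = 2) +
    geom_path(data = dens_common, aes(x, y, group = section), color = "salmon", lwd = 1.1) +
    theme_bw() +
    geom_vline(xintercept = breaks, lty = 2)
p
```

With a shared standard deviation every curve has the same shape, so any visible mismatch between the points and the curves in a section hints at non-constant variance. The `scale` argument plays the same role as the factor 2000 in the answer and would need adjusting for data on a different scale.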
