ggplot2: How to draw small Gaussian density curves along a regression line?

Question

I want to demonstrate the assumptions of linear regression (and later of other kinds of regression). How can I add small Gaussian densities (or any other type of density) along the regression line in my plot, as in the figure?

+7

r plot regression ggplot2

Maju116 Aug 3 '15 at 19:04

1 answer

jenesaisquoi · Accepted Answer · 2015-08-03T20:22:22+0000

You can compute the empirical densities of the residuals for sections along the fitted line. Then it is just a matter of drawing the curves at the positions of your choosing in each interval using geom_path. To add the theoretical distribution, generate densities over the range of the residuals for each section (here, using the normal density). For the normal densities below, the standard deviation of each curve is estimated from the residuals in its section, but you could instead choose a single standard deviation and use it for all of them.

    library(ggplot2)

    ## Sample data
    set.seed(0)
    x <- runif(100, 0, 50)
    dat <- data.frame(x=x, y=rnorm(100, 10*x, 100))

    ## breaks: where you want to compute densities
    breaks <- seq(0, max(dat$x), len=5)
    dat$section <- cut(dat$x, breaks)

    ## Get the residuals
    dat$res <- residuals(lm(y ~ x, data=dat))

    ## Compute densities for each section, and flip the axes, and add means of sections
    ## Note: the densities need to be scaled in relation to the section size (2000 here)
    dens <- do.call(rbind, lapply(split(dat, dat$section), function(x) {
        d <- density(x$res, n=50)
        res <- data.frame(x=max(x$x) - d$y*2000, y=d$x + mean(x$y))
        res <- res[order(res$y), ]
        ## Get some data for normal lines as well
        xs <- seq(min(x$res), max(x$res), len=50)
        res <- rbind(res, data.frame(y=xs + mean(x$y),
                                     x=max(x$x) - 2000*dnorm(xs, 0, sd(x$res))))
        res$type <- rep(c("empirical", "normal"), each=50)
        res
    }))
    dens$section <- rep(levels(dat$section), each=100)

    ## Plot both empirical and theoretical
    ggplot(dat, aes(x, y)) +
        geom_point() +
        geom_smooth(method="lm", fill=NA, lwd=2) +
        geom_path(data=dens, aes(x, y, group=interaction(section, type), color=type), lwd=1.1) +
        theme_bw() +
        geom_vline(xintercept=breaks, lty=2)

Or, with only the Gaussian curves:

    ## Just normal
    ggplot(dat, aes(x, y)) +
        geom_point() +
        geom_smooth(method="lm", fill=NA, lwd=2) +
        geom_path(data=dens[dens$type=="normal",], aes(x, y, group=section), color="salmon", lwd=1.1) +
        theme_bw() +
        geom_vline(xintercept=breaks, lty=2)
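
The answer mentions that a single standard deviation could be used for every curve instead of one per section. A minimal, self-contained sketch of that variant follows. It is not the answer's own code, and it makes a few assumptions: the shared standard deviation is taken to be the model's residual standard error (`sigma(fit)`), each curve is centred on the fitted value at the right-hand edge of its section rather than on the section mean of `y`, and the names `sigma_hat`, `anchors`, `scale_factor`, and `normal_curves` are made up for illustration.

```r
library(ggplot2)

## Same simulated data as in the answer
set.seed(0)
x <- runif(100, 0, 50)
dat <- data.frame(x = x, y = rnorm(100, 10 * x, 100))

fit <- lm(y ~ x, data = dat)

## Assumption: one shared sd for all curves, the residual standard error
sigma_hat <- sigma(fit)

## Section boundaries; curves are drawn at the right-hand edge of each section
breaks <- seq(0, max(dat$x), len = 5)
anchors <- breaks[-1]

## Assumption: same visual scaling constant as in the answer
scale_factor <- 2000

## One normal curve per anchor, centred on the fitted value at that x
normal_curves <- do.call(rbind, lapply(anchors, function(x0) {
  mu <- unname(predict(fit, newdata = data.frame(x = x0)))
  ys <- seq(mu - 3 * sigma_hat, mu + 3 * sigma_hat, len = 50)
  data.frame(
    x = x0 - scale_factor * dnorm(ys, mu, sigma_hat),  # density drawn leftwards
    y = ys,
    anchor = x0
  )
}))

p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_path(data = normal_curves, aes(x, y, group = anchor), color = "salmon") +
  geom_vline(xintercept = breaks, lty = 2) +
  theme_bw()
print(p)
```

Because every curve shares the same `sigma_hat`, all four bells have the same shape and differ only in position, which is the picture usually drawn to illustrate the constant-variance assumption. The per-section version in the answer is more useful when you want to see whether that assumption actually holds in the data.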
