How does ggplot2 density differ from density?

Why do the following graphs look different? Both methods seem to use Gaussian kernels.

How does ggplot2 calculate the density?

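    library(ggplot2)
    library(fueleconomy)

    # Density computed on the raw values; only the x axis is drawn on a log scale
    d <- density(vehicles$cty, n = 2000)
    ggplot(NULL, aes(x = d$x, y = d$y)) + geom_line() + scale_x_log10()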


    ggplot(vehicles, aes(x = cty)) + geom_density() + scale_x_log10()



UPDATE:

A solution to this issue already appears on SO here; however, it remains unclear which specific parameters ggplot2 passes to the density function of R's stats package.

An alternative solution is to extract the density data directly from the ggplot2 plot, as shown here.

1 answer

In this case, the difference is not in the density calculation itself but in how the log10 transformation is applied.

First, check that the two densities are similar without the transformation:

    library(ggplot2)
    library(fueleconomy)

    # Manual density, restricted to the range of the data
    d <- density(vehicles$cty, from = min(vehicles$cty), to = max(vehicles$cty))
    ggplot(data.frame(x = d$x, y = d$y), aes(x = x, y = y)) + geom_line()

    # The ggplot2 equivalent
    ggplot(vehicles, aes(x = cty)) + stat_density(geom = "line")

So the difference comes from the transformation. With scale_x_log10(), stat_density appears to apply the log10 transformation to the x variable before calculating the density, whereas the manual version calculates the density on the raw values and only draws the x axis on a log scale. To reproduce the ggplot2 result manually, transform the variable before calculating the density. For instance:

    # Manual density of the log10-transformed values
    d2 <- density(log10(vehicles$cty),
                  from = min(log10(vehicles$cty)),
                  to = max(log10(vehicles$cty)))
    ggplot(data.frame(x = d2$x, y = d2$y), aes(x = x, y = y)) + geom_line()

    # The ggplot2 equivalent
    ggplot(vehicles, aes(x = cty)) + stat_density(geom = "line") + scale_x_log10()
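To make the difference concrete without plotting, here is a rough base-R sketch. It uses simulated log-normal data as a stand-in for vehicles$cty (an assumption, not the real data set), and trapz is just a hypothetical helper name for trapezoidal integration. It compares a density computed on the raw values with one computed on the log10 values.

```r
# Simulated stand-in for vehicles$cty (assumption: NOT the real data)
set.seed(1)
x <- rlnorm(5000, meanlog = log(17), sdlog = 0.3)

# Approach 1: density of the raw values (what the manual plot in the question used)
d_raw <- density(x, from = min(x), to = max(x))

# Approach 2: density of log10(x) (what a log10 scale leads to in ggplot2)
d_log <- density(log10(x), from = min(log10(x)), to = max(log10(x)))

# Hypothetical helper: trapezoidal rule for the area under a curve
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Each curve integrates to roughly 1, but over a different variable
area_raw <- trapz(d_raw$x, d_raw$y)  # integrated over x
area_log <- trapz(d_log$x, d_log$y)  # integrated over log10(x)

# The heights are therefore on very different scales
peak_raw <- max(d_raw$y)
peak_log <- max(d_log$y)

print(c(area_raw = area_raw, area_log = area_log))
print(c(peak_raw = peak_raw, peak_log = peak_log))
```

Both curves are valid densities, each with respect to its own variable, so drawing the first one on a log-scaled axis cannot turn it into the second.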

PS: To see how ggplot2 prepares the data for the density, you can inspect as.list(StatDensity), which leads to StatDensity$compute_group and, from there, to ggplot2:::compute_density.
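Another way to check this is to pull the computed values out of the plot with layer_data() and compare them with a manual density() call. The sketch below again uses simulated data rather than vehicles (an assumption). It also assumes that stat_density keeps its defaults, so the manual call relies on the default bandwidth and kernel of density(); exact agreement may vary between ggplot2 versions.

```r
library(ggplot2)

# Simulated stand-in for the vehicles data (assumption: NOT the real data)
set.seed(42)
sim <- data.frame(cty = rlnorm(2000, meanlog = log(17), sdlog = 0.3))

p <- ggplot(sim, aes(x = cty)) + geom_density() + scale_x_log10()

# Data computed by the density stat; x is already in log10 units here
built <- layer_data(p)

# Manual density of the transformed values on the same grid size
lx <- log10(sim$cty)
manual <- density(lx, from = min(lx), to = max(lx), n = nrow(built))

# Largest pointwise difference between ggplot2's curve and the manual one
max_diff <- max(abs(built$density - manual$y))
print(max_diff)
```

If max_diff is close to zero, ggplot2 is calling density() on the transformed variable with the default settings.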
