How does ggplot2 density differ from density?

Question


Why do the following graphs look different? Both methods seem to use Gaussian kernels.

How does ggplot2 calculate the density?

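 library(ggplot2)
 library(fueleconomy)
 # Density from stats::density(), drawn on a log10 x-axis
 d <- density(vehicles$cty, n = 2000)
 ggplot(NULL, aes(x = d$x, y = d$y)) + geom_line() + scale_x_log10()

 # Density computed by ggplot2 itself, on the same log10 x-axis
 ggplot(vehicles, aes(x = cty)) + geom_density() + scale_x_log10()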

UPDATE:

A solution to this issue already appears on SO here; however, the specific parameters that ggplot2 passes to the density function in R's stats package remain unclear.

An alternative solution is to extract the density data directly from the ggplot2 plot, as shown here


r ggplot2 density-plot

Megatron Apr 21 '16 at 10:53


1 answer

user20650 · Accepted Answer · 2016-04-22T02:18:48+0000

In this case, it is not the density calculation that differs, but how the log10 transformation is applied.

First, check that the two densities are similar without the transformation:

 library(ggplot2)
 library(fueleconomy)
 d <- density(vehicles$cty, from = min(vehicles$cty), to = max(vehicles$cty))
 ggplot(data.frame(x = d$x, y = d$y), aes(x = x, y = y)) + geom_line()
 ggplot(vehicles, aes(x = cty)) + stat_density(geom = "line")

So the issue seems to be the transformation. In the stat_density call below, it seems that the log10 transformation is applied to the x variable before the density is calculated. Thus, to reproduce the result manually, you must transform the variable before calculating the density. For instance,

 d2 <- density(log10(vehicles$cty), from = min(log10(vehicles$cty)), to = max(log10(vehicles$cty)))
 ggplot(data.frame(x = d2$x, y = d2$y), aes(x = x, y = y)) + geom_line()
 ggplot(vehicles, aes(x = cty)) + stat_density(geom = "line") + scale_x_log10()
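To see why the order matters, here is a small base-R sketch of the change of variables. It uses simulated log-normal data as a stand-in for vehicles$cty (an assumption, so that it runs without fueleconomy), and the helper area() is a hypothetical trapezoid-rule function written just for this illustration. If U = log10(X), then f_U(u) = f_X(10^u) * 10^u * log(10), so a density estimated on the raw scale and merely drawn on a log axis is a different curve from a density estimated on the log10 scale.

```r
set.seed(1)
# Simulated positive, right-skewed data; a stand-in for vehicles$cty (assumption)
x <- rlnorm(5000, meanlog = log(17), sdlog = 0.25)

# Density estimated on the raw scale, as in density(vehicles$cty)
d_raw <- density(x, n = 512)

# Density estimated on the log10 scale, as in density(log10(vehicles$cty))
d_log <- density(log10(x), n = 512)

# Re-express the raw-scale density on the log10 scale:
# f_U(u) = f_X(10^u) * 10^u * log(10)
u <- d_log$x
f_x_at_u <- approx(d_raw$x, d_raw$y, xout = 10^u, yleft = 0, yright = 0)$y
d_raw_on_log <- f_x_at_u * 10^u * log(10)

# Hypothetical helper: trapezoid-rule area under a curve
area <- function(x, y) sum(diff(x) * (y[-1] + y[-length(y)]) / 2)

# Each curve should integrate to roughly 1 on its own scale
print(c(raw = area(d_raw$x, d_raw$y),
        log10 = area(d_log$x, d_log$y),
        raw_mapped_to_log10 = area(u, d_raw_on_log)))

# The peak heights of the raw-scale and log10-scale estimates differ,
# which is why the two plots in the question do not match
print(c(peak_raw = max(d_raw$y), peak_log10 = max(d_log$y)))
```

The raw-scale curve only lines up with the log10-scale estimate after it is multiplied by the factor 10^u * log(10); plotting d$y unchanged against a log10 axis skips that step.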

PS: To see how ggplot2 prepares the data for the density, you can look at the code via as.list(StatDensity), which leads to StatDensity$compute_group and then to ggplot2:::compute_density.
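Another way to check what ggplot2 computed, without reading the internals, is to pull the layer data out of the plot with layer_data(). The sketch below uses the mpg dataset that ships with ggplot2 (its cty column is only a stand-in for vehicles$cty, an assumption so the example runs without fueleconomy). The x values returned are already on the log10 scale, which is consistent with the transformation being applied before the density is calculated.

```r
library(ggplot2)

# mpg ships with ggplot2; mpg$cty stands in for vehicles$cty (assumption)
p <- ggplot(mpg, aes(x = cty)) +
  stat_density(geom = "line") +
  scale_x_log10()

# layer_data() returns the data frame that the first layer actually draws
ld <- layer_data(p, 1)

# The x column is on the log10 scale, not the original scale
print(range(ld$x))
print(range(log10(mpg$cty)))

# Manual equivalent: transform first, then estimate over the same range
lx <- log10(mpg$cty)
d2 <- density(lx, from = min(lx), to = max(lx))

# Inspect both results side by side
print(head(ld[, c("x", "density")]))
print(head(data.frame(x = d2$x, density = d2$y)))
```

Comparing the two printed tables is a quick way to confirm that the manual calculation reproduces what the plot shows.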
