Plotting means and confidence intervals with ggplot

I have some data that I collected from a model. I want to plot population size over time. I have a population size at each time step, and 100 replicates. I would like to plot the mean population size at each time step, along with 95% confidence intervals (as a shaded band, if possible).

I have not used ggplot before; so far I have only used base graphics in R. But I want to see what this would look like in ggplot.

Here is what I have so far:

 ggplot(data=model1, aes(x=steps., y= pop-size, col='blue')) + geom_line() 

This displays all the points and it looks good, but I don't know how to plot the means and add confidence intervals.

1 answer

Since you have replicated data and you want to plot the mean and confidence limits, you are probably better off using stat_summary(...), which is designed to (you guessed it) summarize data. Basically, it applies a function (for example mean(...)) to all the y values at each x value, and then plots the result using whatever geom you specify. Here is an example:

    # sample data - should be provided in question
    set.seed(1)   # for reproducible example
    time <- 1:25
    df <- data.frame(time, pop=rnorm(100*length(time), mean=10*time/(25+time)))

    library(ggplot2)
    ggplot(df, aes(x=time, y=pop))+
      stat_summary(geom="ribbon", fun.data=mean_cl_normal, width=0.1, conf.int=0.95, fill="lightblue")+
      stat_summary(geom="line", fun.y=mean, linetype="dashed")+
      stat_summary(geom="point", fun.y=mean, color="red")

So we have 3 layers: a layer that summarizes the y values using the mean(...) function and plots them using geom="line", a layer that summarizes them the same way but plots using geom="point", and a layer that uses geom="ribbon". This geom requires the ymin and ymax aesthetics, so we use the mean_cl_normal function built into ggplot2 (a wrapper around the Hmisc package) to generate them. It assumes that the error is normally distributed, and therefore that the means follow a Student's t distribution. Type ?hmisc for documentation on the various functions that are useful for confidence limits. Layers are rendered in the order of the code, so since you want the shading, we need to put the error ribbon first.
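To make the ymin / ymax calculation concrete, here is a minimal base R sketch of a t-based interval for one time step. The helper name mean_ci_t is made up for illustration (it is not part of ggplot2 or Hmisc), and it is only my reading of what a normal-theory confidence limit amounts to, not the package's actual code.

```r
# Hypothetical helper (not part of ggplot2 or Hmisc): mean and t-based
# confidence limits for a single numeric vector.
mean_ci_t <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]                      # drop missing values
  n <- length(x)
  m <- mean(x)
  se <- sd(x) / sqrt(n)                  # standard error of the mean
  half <- qt((1 + conf.int) / 2, df = n - 1) * se
  data.frame(y = m, ymin = m - half, ymax = m + half)
}

set.seed(1)
reps <- rnorm(100, mean = 5)             # 100 replicates at one time step
ci <- mean_ci_t(reps)
print(ci)
```

A function of this shape (one that returns a data frame with y, ymin and ymax) can also be passed to stat_summary(...) as fun.data, if you would rather not depend on Hmisc.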

Finally, you could of course summarize the data yourself using dplyr or something similar, but I really don't see the point.
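For completeness, a rough sketch of what the pre-summarized route might look like. It reuses the sample data from the example above; the column names n_rep, mean_pop, se, lower and upper are my own choices, and the interval is the same t-based one, computed by hand.

```r
library(dplyr)
library(ggplot2)

# same sample data as in the example above
set.seed(1)
time <- 1:25
df <- data.frame(time, pop = rnorm(100 * length(time), mean = 10 * time / (25 + time)))

# one row per time step: mean and t-based 95% limits
# (column names here are illustrative assumptions)
summ <- df %>%
  group_by(time) %>%
  summarise(
    n_rep    = n(),
    mean_pop = mean(pop),
    se       = sd(pop) / sqrt(n_rep),
    lower    = mean_pop - qt(0.975, n_rep - 1) * se,
    upper    = mean_pop + qt(0.975, n_rep - 1) * se
  )

# ribbon first so the line and points are drawn on top of it
p <- ggplot(summ, aes(x = time, y = mean_pop)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "lightblue") +
  geom_line(linetype = "dashed") +
  geom_point(color = "red")
print(p)
```

This gives you the summary table as a by-product, which may be handy if you also need the numbers, but for the plot alone stat_summary(...) is less code.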

Update (based on the last comment): It seems that the most recent version of ggplot2 (2.0.0) has a different way of passing arguments to the fun.data function. This works in the new version:

    ggplot(df, aes(x=time, y=pop))+
      stat_summary(geom="ribbon", fun.data=mean_cl_normal, fun.args=list(conf.int=0.95), fill="lightblue")+
      stat_summary(geom="line", fun.y=mean, linetype="dashed")+
      stat_summary(geom="point", fun.y=mean, color="red")

The problem with the width=... argument is a little more subtle, I think: it is not really needed (in the original answer I used error bars and forgot to delete this argument when I changed to a ribbon). In older versions of ggplot2, extraneous arguments were ignored (hence no error). The new version seems to be more strict. This is probably better.
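One more note for anyone on a later release: as of ggplot2 3.3.0 the fun.y argument is deprecated in favor of fun. A sketch of the same plot with the newer argument name, assuming ggplot2 3.3.0 or later and that the Hmisc package is installed (mean_cl_normal needs it when the plot is drawn):

```r
library(ggplot2)

# same sample data as in the example above
set.seed(1)
time <- 1:25
df <- data.frame(time, pop = rnorm(100 * length(time), mean = 10 * time / (25 + time)))

# fun replaces the deprecated fun.y; extra arguments for fun.data go in fun.args
p <- ggplot(df, aes(x = time, y = pop)) +
  stat_summary(geom = "ribbon", fun.data = mean_cl_normal,
               fun.args = list(conf.int = 0.95), fill = "lightblue") +
  stat_summary(geom = "line", fun = mean, linetype = "dashed") +
  stat_summary(geom = "point", fun = mean, color = "red")

print(p)  # drawing the ribbon calls mean_cl_normal, which requires Hmisc
```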
