Plot the mean and sd of a dataset per x value using ggplot2

I have a dataset that looks something like this:

    library(ggplot2)

    a <- data.frame(x = rep(c(1, 2, 3, 5, 7, 10, 15, 20), 5),
                    y = rnorm(40, sd = 2) + rep(c(4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5), 5))

    ggplot(a, aes(x = x, y = y)) +
      geom_point() +
      geom_smooth()

[Figure: graph output]

I want to get the same result as this graph, but instead of a smooth curve, I just want line segments connecting the mean (with the sd) for each set of x values. The graph should look similar to the one above, but jagged rather than curved.

I tried this, but it fails even though the x values are not unique:

    ggplot(a, aes(x = x, y = y)) +
      geom_point() +
      stat_smooth(aes(group = x, y = y, x = x))

    geom_smooth: Only one unique x value each group. Maybe you want aes(group = 1)?
4 answers

You can try writing a summary function as proposed by Hadley Wickham on the ggplot2 website: http://had.co.nz/ggplot2/stat_summary.html. Applying his suggestion to your code:

    p <- qplot(x, y, data = a)

    # Wrapper that applies a summary function and draws it with the given geom
    stat_sum_df <- function(fun, geom = "crossbar", ...) {
      stat_summary(fun.data = fun, colour = "blue", geom = geom, width = 0.2, ...)
    }

    p + stat_sum_df("mean_cl_normal", geom = "smooth")

This produces the following figure:

[Figure: resulting plot]

?stat_summary is what you should be looking at.

Here is an example:

    # Functions to calculate the upper and lower bounds (mean +/- z * sd).
    # qnorm(1 - .alpha / 2) is the positive normal quantile, so uci lies above lci.
    uci <- function(y, .alpha) { mean(y) + qnorm(1 - abs(.alpha) / 2) * sd(y) }
    lci <- function(y, .alpha) { mean(y) - qnorm(1 - abs(.alpha) / 2) * sd(y) }

    ggplot(a, aes(x = x, y = y)) +
      stat_summary(fun.y = mean, geom = 'line', colour = 'blue') +
      stat_summary(fun.y = mean, geom = 'ribbon',
                   fun.ymax = uci, fun.ymin = lci,
                   .alpha = 0.05, alpha = 0.5)

Using ggplot2 0.9.3.1, the following does the trick for me:

    ggplot(a, aes(x = x, y = y)) +
      geom_point() +
      stat_summary(fun.data = 'mean_sdl', mult = 1, geom = 'smooth')

"mean_sdl" is ggplot2's wrapper around the Hmisc package function "smean.sdl", and the "mult" argument gives the number of standard deviations (above and below the mean).
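To make the role of "mult" concrete, here is a minimal base R sketch of the same kind of summary: the mean of y per x value, with bounds at mean minus and plus mult standard deviations. The helper name mean_sd_summary is hypothetical (it is not part of ggplot2 or Hmisc), and set.seed is added only so the example is reproducible.

```r
# Hypothetical helper, not part of ggplot2 or Hmisc: returns the mean and
# the mean -/+ mult standard deviations in columns named y, ymin, ymax.
mean_sd_summary <- function(y, mult = 1) {
  m <- mean(y)
  s <- sd(y)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

set.seed(1)  # assumption: fixed seed for reproducibility only
a <- data.frame(x = rep(c(1, 2, 3, 5, 7, 10, 15, 20), 5),
                y = rnorm(40, sd = 2) + rep(c(4, 3.5, 3, 2.5, 2, 1.5, 1, 0.5), 5))

# Group the y values by x, then build one summary row per distinct x value
groups <- split(a$y, a$x)
per_x <- do.call(rbind, lapply(groups, mean_sd_summary, mult = 1))
per_x$x <- as.numeric(names(groups))

print(per_x)
```

Since the helper returns a data frame with y, ymin and ymax columns, it should also be usable as the fun.data argument of stat_summary in place of 'mean_sdl', though that is untested here.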

Details of the original function:

    library('Hmisc')
    ?smean.sdl

You can use mean_sdl, one of the built-in summary functions. The code is shown below.

    ggplot(a, aes(x = x, y = y)) +
      stat_summary(fun.y = 'mean', colour = 'blue', geom = 'line') +
      stat_summary(fun.data = 'mean_sdl', geom = 'ribbon', alpha = 0.2)