How to plot multiple ECDF curves using ggplot?

I have some data formatted as follows:

2 2
2 1
2 1
2 1
2 1
2 1
2 2
2 1
2 1
2 1
2 2
2 2
2 1
2 1
2 2
2 2
2 1
2 1
2 1
2 1
2 1
2 1
2 1
3 1
3 1
3 1
3 3
3 2
3 2
4 4
4 2
4 4
4 2
4 4
4 2
4 2
4 4
4 2
4 2
4 1
4 1
4 2
4 3
4 1
4 3
6 1
6 1
6 2
7 1
7 1
7 1
7 1
7 1
8 2
8 2
8 2
8 2
8 2
8 2
12 1
12 1
12 1
12 1
12 1

I am trying to build an ECDF of this dataset for each distinct value in the first column. In this case, that means drawing 7 ECDF curves on one plot (one for all points having 2 in their first column, one for all points having 3 in their first column, etc.). For a single value of the first column, I can build the ECDF plot using the following:

    data = read.table("./test", header=F)
    data1 = data[data$V1 == 2,]
    qplot(unique(data1$V2), ecdf(data1$V2)(unique(data1$V2)), geom='step')

But I can't figure out how to draw several curves on the same plot. Any suggestions?

2 answers

This is easier if you move away from qplot():

    library(plyr)
    library(ggplot2)

    # Example data: two groups of 40 values each
    df <- data.frame(
      grp = as.factor(rep(c("A", "B"), each = 40)),
      val = c(sample(c(2:4, 6:8, 12), 40, replace = TRUE),
              sample(1:4, 40, replace = TRUE))
    )
    df <- arrange(df, grp, val)

    # Evaluate each group's ECDF at that group's own values
    dfecdf <- ddply(df, .(grp), transform, ecdf = ecdf(val)(val))

    p <- ggplot(dfecdf, aes(val, ecdf, colour = grp))
    p + geom_step()

You can also easily add facet_wrap() for multiple grouping variables and xlab()/ylab() for axis labels.

[Figure: multiple ECDFs]

    # Example data: 240 rows; grp2 has 120 entries and is recycled to fit
    df <- data.frame(
      grp  = as.factor(rep(c("A", "B"), each = 120)),
      grp2 = as.factor(rep(c("cat", "dog", "elephant"), 40)),
      val  = c(sample(c(2:4, 6:8, 12), 120, replace = TRUE),
               sample(1:4, 120, replace = TRUE))
    )
    df <- arrange(df, grp, grp2, val)

    # One ECDF per combination of grp and grp2
    dfecdf <- ddply(df, .(grp, grp2), transform, ecdf = ecdf(val)(val))

    # Colour by grp, one panel per level of grp2
    p <- ggplot(dfecdf, aes(val, ecdf, colour = grp))
    p + geom_step() + facet_wrap(~grp2)

[Figure: ECDFs using 2 grouping variables]
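The per-group ECDF step does not strictly require plyr. Below is a minimal sketch that assumes base R's ave() is an acceptable substitute for ddply() here, and that also adds the axis labels mentioned above. The seed and the label text are illustrative choices, not part of the original answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Same shape of example data as the first snippet: two groups of 40 values
df <- data.frame(
  grp = factor(rep(c("A", "B"), each = 40)),
  val = c(sample(c(2:4, 6:8, 12), 40, replace = TRUE),
          sample(1:4, 40, replace = TRUE))
)

# ave() applies FUN within each level of grp and returns a vector aligned
# with the original rows, so it can be stored as a new column
df$ecdf <- ave(df$val, df$grp, FUN = function(v) ecdf(v)(v))

# Order rows by group and value, mirroring arrange() in the plyr version
df <- df[order(df$grp, df$val), ]

p <- ggplot(df, aes(val, ecdf, colour = grp)) +
  geom_step() +
  xlab("value") +                  # illustrative label text
  ylab("cumulative proportion")    # illustrative label text
print(p)
```

Whether ave() or ddply() reads better is mostly a matter of taste; ave() just avoids the extra dependency.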


Since late 2012, ggplot2 has included a dedicated function for plotting ECDFs, stat_ecdf() (see the ggplot2 docs).

The example from there is even shorter than Ari's good solution:

    df <- data.frame(x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
                     g = gl(2, 100))
    ggplot(df, aes(x, colour = g)) + stat_ecdf()

[Figure: ECDF]
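Applied to the two-column layout from the question, the same idea might look like the sketch below. It assumes the columns keep read.table()'s default names V1 and V2, and it inlines only a subset of the question's rows (first-column values 3, 6 and 12) so that it runs without the ./test file. The legend and axis titles are illustrative.

```r
library(ggplot2)

# Subset of the question's rows, inlined instead of reading "./test"
data <- read.table(header = FALSE, text = "
3 1
3 1
3 1
3 3
3 2
3 2
6 1
6 1
6 2
12 1
12 1
12 1
12 1
12 1
")

# V1 is numeric, so convert it to a factor to get one curve per distinct value
p <- ggplot(data, aes(V2, colour = factor(V1))) +
  stat_ecdf() +
  labs(x = "V2", y = "ECDF", colour = "V1")
print(p)
```

With the full file, replacing the inline text with read.table("./test", header = FALSE) should give one curve for each of the 7 distinct values of V1. Adding facet_wrap(~ V1) would put each curve in its own panel instead.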
