Grouped barplot in R with error bars

Dear Stackoverflow Users,

I would like to draw a grouped barplot with error bars. Here is the figure I have managed to produce so far, and it is fine for what I need:

[Figure: grouped barplot with error bars produced by the script below]

And here is my script:

    # create dataframe
    Gene <- c("Gene1","Gene2","Gene1","Gene2")
    count1 <- c(12,14,16,34)
    count2 <- c(4,7,9,23)
    count3 <- c(36,22,54,12)
    count4 <- c(12,24,35,23)
    Species <- c("A","A","B","B")
    df <- data.frame(Gene,count1,count2,count3,count4,Species)
    df

    mean1 <- mean(as.numeric(df[1,][c(2,3,4,5)]))
    mean2 <- mean(as.numeric(df[2,][c(2,3,4,5)]))
    mean3 <- mean(as.numeric(df[3,][c(2,3,4,5)]))
    mean4 <- mean(as.numeric(df[4,][c(2,3,4,5)]))

    Gene1SpeciesA.stdev <- sd(as.numeric(df[1,][c(2,3,4,5)]))
    Gene2SpeciesA.stdev <- sd(as.numeric(df[2,][c(2,3,4,5)]))
    Gene1SpeciesB.stdev <- sd(as.numeric(df[3,][c(2,3,4,5)]))
    Gene2SpeciesB.stdev <- sd(as.numeric(df[4,][c(2,3,4,5)]))

    ToPlot <- c(mean1,mean2,mean3,mean4)

    # plot barplot
    # rows = species, columns = genes
    plot <- matrix(ToPlot,2,2,byrow=TRUE) # with 2 being replaced by the number of genes!
    # barplot(beside=TRUE) draws one group per column, so the matrix is passed as is;
    # transposing it would group the bars by species under the gene labels
    BarPlot <- barplot(plot, beside=TRUE, ylab="count", names.arg=c("Gene1","Gene2"), col=c("blue","red"))

    # add legend
    legend("topright", legend = c("SpeciesA","SpeciesB"), fill = c("blue","red"))

    # add error bars (half-width of a 95% interval: 1.96 * sd / sqrt(n), n = 4 replicates)
    ee <- matrix(c(Gene1SpeciesA.stdev,Gene2SpeciesA.stdev,Gene1SpeciesB.stdev,Gene2SpeciesB.stdev),2,2,byrow=TRUE)*1.96/sqrt(4)
    # error.bar() is not part of base R; it is assumed to be a user-defined helper
    error.bar(BarPlot,plot,ee)
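The script above calls error.bar(), which base R does not provide and which is not defined in the post. A minimal sketch of such a helper, assuming it is the usual wrapper around arrows() (the argument names here are hypothetical, not taken from the post):

```r
# Hypothetical helper, not part of base R: draws vertical error bars
# at positions x, centred on y, extending `upper` above and `lower` below.
error.bar <- function(x, y, upper, lower = upper, length = 0.1, ...) {
  # angle = 90 with code = 3 puts a flat cap on both ends of each segment
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3, length = length, ...)
}

# Small usage example; values are rounded from the data in the post
# (rows = species A and B, columns = Gene1 and Gene2)
means <- matrix(c(16, 28.5, 16.75, 23), 2, 2)
errs  <- matrix(c(13.6, 19.8, 7.6, 8.8), 2, 2)
bp <- barplot(means, beside = TRUE, ylim = c(0, 60),
              names.arg = c("Gene1", "Gene2"), col = c("blue", "red"))
error.bar(bp, means, errs)
```

barplot() returns the x midpoints of the bars, which is why its result is passed as the first argument. Setting ylim explicitly keeps the upper caps inside the plotting area.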

The problem is that I need to do this for 50 genes and 4 species, so my script will get very long, and I don't think it is well optimized. I tried to find help here, but I can't figure out how to adapt it to what I need. If I didn't need error bars, I could adapt this script, but the hard part is combining nice ggplot barplots with error bars! ;)

If you have an idea to optimize my script, I would really appreciate it! :)

Thanks a lot!

1 answer

Starting with your df definition, you can do this in a few lines:

    library(ggplot2)

    cols = c(2,3,4,5)
    df1 = transform(df, mean=rowMeans(df[cols]), sd=apply(df[cols],1, sd))

    # df1 looks like this
    #    Gene count1 count2 count3 count4 Species  mean        sd
    # 1 Gene1     12      4     36     12       A 16.00 13.856406
    # 2 Gene2     14      7     22     24       A 16.75  7.804913
    # 3 Gene1     16      9     54     35       B 28.50 20.240224
    # 4 Gene2     34     23     12     23       B 23.00  8.981462

    ggplot(df1, aes(x=as.factor(Gene), y=mean, fill=Species)) +
      geom_bar(position=position_dodge(), stat="identity", colour='black') +
      geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.2, position=position_dodge(.9))

[Figure: grouped barplot of the mean per gene, filled by species, with error bars]
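The same approach should scale to the 50 genes and 4 species mentioned in the question without extra code per gene. The sketch below is only an illustration under stated assumptions: the data are simulated (the real counts are not in the post), the replicate columns are assumed to share the prefix "count", and the error bars use 1.96 * sd / sqrt(n) as in the question's script rather than plus or minus one sd.

```r
library(ggplot2)

# Simulated stand-in data (hypothetical): 50 genes x 4 species, 4 replicates each
set.seed(1)
df <- expand.grid(Gene = paste0("Gene", 1:50), Species = c("A", "B", "C", "D"))
for (i in 1:4) df[[paste0("count", i)]] <- rpois(nrow(df), lambda = 20)

# Select the replicate columns by name instead of hard-coding their positions
cols <- grep("^count", names(df), value = TRUE)
n <- length(cols)

df$mean <- rowMeans(df[cols])
df$sd   <- apply(df[cols], 1, sd)
# Half-width of the normal-approximation 95% interval, as in the question
df$ci   <- 1.96 * df$sd / sqrt(n)

p <- ggplot(df, aes(x = Gene, y = mean, fill = Species)) +
  geom_bar(stat = "identity", position = position_dodge(), colour = "black") +
  geom_errorbar(aes(ymin = mean - ci, ymax = mean + ci),
                width = 0.2, position = position_dodge(0.9)) +
  ylab("count") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))  # 50 labels need rotating
print(p)
```

With real data, only the block that builds df would change; everything from the grep() call onward works on whatever rows and "count" columns are present.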
