Summary Statistics

For the following data set

Genre   Amount
Comedy  10
Drama   30
Comedy  20
Action  20
Comedy  20
Drama   20

I want to build a line graph with ggplot2, where the x axis is Genre and the y axis is the sum of Amount within each Genre.

I tried the following:

p = ggplot(test, aes(factor(Genre), Amount)) + geom_point()
p = ggplot(test, aes(factor(Genre), Amount)) + geom_line()
p = ggplot(test, aes(factor(Genre), sum(Amount))) + geom_line()

but to no avail.

2 answers

If you do not want to compute a new data frame before plotting, you can use stat_summary in ggplot2. For example, if your data set looks like this:

R> df <- data.frame(Genre=c("Comedy","Drama","Action","Comedy","Drama"),
R+                  Amount=c(10,30,40,10,20))
R> df
   Genre Amount
1 Comedy     10
2  Drama     30
3 Action     40
4 Comedy     10
5  Drama     20

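You can either use qplot with the argument stat="summary" (this form relies on older ggplot2 releases, since newer ones deprecate the stat argument of qplot):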

R> qplot(Genre, Amount, data=df, stat="summary", fun.y="sum")

Or add stat_summary to a basic ggplot call:

R> ggplot(df, aes(x=Genre, y=Amount)) + stat_summary(fun.y="sum", geom="point")
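
Since the question asks for a line rather than points, here is a minimal sketch of the same stat_summary idea with geom="line". It assumes ggplot2 3.3.0 or later, where the argument is called fun (older releases use fun.y as above), and it rebuilds the question's data under the name test. The group = 1 aesthetic is needed because Genre is discrete; without it each genre forms its own group and no line is drawn.

```r
library(ggplot2)

# The data set from the question
test <- data.frame(
  Genre  = c("Comedy", "Drama", "Comedy", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 20, 20, 20, 20)
)

# group = 1 places all genres in a single group so the line can connect them;
# stat_summary() then sums Amount separately at each Genre
p <- ggplot(test, aes(x = Genre, y = Amount, group = 1)) +
  stat_summary(fun = sum, geom = "line") +
  stat_summary(fun = sum, geom = "point")
print(p)
```

The second stat_summary layer only marks the summed values; drop it if the line alone is enough.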

Try something like this:

dtf <- structure(list(Genre = structure(c(2L, 3L, 2L, 1L, 2L, 3L), .Label = c("Action", 
"Comedy", "Drama"), class = "factor"), Amount = c(10, 30, 20, 
20, 20, 20)), .Names = c("Genre", "Amount"), row.names = c(NA, 
-6L), class = "data.frame")

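library(reshape)
library(ggplot2)
# Long format: Genre is used as the id variable
mdtf <- melt(dtf)
# Sum the values within each Genre; the result column is named (all)
cdtf <- cast(mdtf, Genre ~ . , sum)
# stat = "identity" plots the precomputed sums instead of counting rows
ggplot(cdtf, aes(Genre, `(all)`)) + geom_bar(stat = "identity")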
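
If you would rather avoid the reshape dependency, the same per-genre totals can be computed with base R's aggregate() and then drawn as a line. This is only a sketch of an alternative, not part of the answer above; the names dtf and totals are chosen here for illustration.

```r
library(ggplot2)

# The data set from the question
dtf <- data.frame(
  Genre  = c("Comedy", "Drama", "Comedy", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 20, 20, 20, 20)
)

# One row per Genre, holding the summed Amount
totals <- aggregate(Amount ~ Genre, data = dtf, FUN = sum)
print(totals)

# The sums are already computed, so plain geom_line() is enough;
# group = 1 is still required because Genre is discrete
p <- ggplot(totals, aes(x = Genre, y = Amount, group = 1)) +
  geom_line() +
  geom_point()
print(p)
```

For this data, totals should hold 20 for Action and 50 each for Comedy and Drama.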