I want to build a histogram-style bar chart summarizing a variable along two dimensions: one is distributed along the x axis and the other is stacked vertically within each bar.
I would expect the following two instructions to produce the same plot, but they do not; only the second, where I aggregate the data myself, gives the desired result.
I would like to understand what happens in the first case, and whether there is a way to use the ggplot2 built-in aggregation functions to get the correct output.
library(ggplot2)
library(dplyr)
p1 <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_bar(stat = "sum", na.rm = TRUE)
giving this graph:

p2 <- ggplot(diamonds %>%
               group_by(cut, color) %>%
               summarize_at("price", sum, na.rm = TRUE),
             aes(cut, price, fill = color)) +
  geom_bar(stat = "identity", na.rm = TRUE)
giving this image:

For reference, these are the totals that the tops of the bars should reach; the bars in p1 do not reach these values:
diamonds %>% group_by(cut) %>% summarize_at("price", sum, na.rm = TRUE)
# # A tibble: 5 x 2
#   cut          price
#   <ord>        <int>
# 1 Fair       7017600
# 2 Good      19275009
# 3 Very Good 48107623
# 4 Premium   63221498
# 5 Ideal     74513487
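A possible direction, offered as a sketch rather than a confirmed answer: `stat = "sum"` selects `stat_sum()`, which counts the observations at each (x, y) location (it is the stat behind `geom_count()`) and does not add up `price`. Asking for an explicit summary through `stat_summary()` should let ggplot2 do the aggregation itself. The name `p3` is hypothetical, and the `fun` argument assumes ggplot2 3.3.0 or later (older versions used `fun.y`).

```r
library(ggplot2)

# Sketch: sum price within each cut/color group, then stack the bars.
# Assumes ggplot2 >= 3.3.0 (the `fun` argument); p3 is a hypothetical name.
p3 <- ggplot(diamonds, aes(cut, price, fill = color)) +
  stat_summary(fun = sum, geom = "bar", position = "stack")
```

Printing `p3` should draw stacked bars whose tops can be compared against the per-cut totals listed above.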