Filtering within dplyr's summarise function

Question

I am struggling with dplyr because I want to do two things in one step and wonder whether this is possible.

I want to calculate the mean of all values and, at the same time, the mean of only those values that have a specific value in another column.

    library(dplyr)
    set.seed(1234)
    df <- data.frame(id = rep(1:10, each = 14),
                     tp = letters[1:14],
                     value_type = sample(LETTERS[1:3], 140, replace = TRUE),
                     values = runif(140))

    df %>%
      group_by(id, tp) %>%
      summarise(
        all_mean = mean(values),
        A_mean = mean(values),  # Only the values with value_type A
        value_count = sum(value_type == 'A')
      )

So the column A_mean should contain the mean of the values where value_type == 'A'.

Usually I run two separate commands and merge the results afterwards, but I think there is a more convenient way that I am just not seeing.

Thanks in advance.

+5

r dplyr

drmariod Jun 29 '16 at 8:30

2 answers

You can do this with two summarise steps:

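    df %>%
      group_by(id, tp, value_type) %>%
      # Step 1: mean and row count per value_type within each (id, tp)
      summarise(type_mean = mean(values), n = n()) %>%
      # Step 2: collapse over value_type; the original values column is gone
      # here, so the overall mean is rebuilt as a count-weighted mean
      summarise(all_mean = sum(type_mean * n) / sum(n),
                A_mean = sum(type_mean * (value_type == "A")),
                value_count = sum(n * (value_type == "A")))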

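The first summarise computes the mean for each value_type within a group. In the second, multiplying by (value_type == "A") inside sum keeps only the mean for value_type "A". Note that this yields 0, not NA, for a group that has no "A" rows.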
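One thing to watch with any two-step approach: a plain mean of per-type means equals the overall mean only when each type is weighted by its row count. A minimal sketch with hypothetical toy data (the toy data frame and its numbers are made up for illustration, not taken from the question):

```r
library(dplyr)

# Hypothetical toy data: two "A" rows and one "B" row
toy <- data.frame(
  value_type = c("A", "A", "B"),
  values     = c(1, 3, 8)
)

# Per-type mean and row count
per_type <- toy %>%
  group_by(value_type) %>%
  summarise(type_mean = mean(values), n = n())

unweighted <- mean(per_type$type_mean)                      # mean of (2, 8)
weighted   <- weighted.mean(per_type$type_mean, per_type$n) # weights (2, 1)
overall    <- mean(toy$values)                              # mean of (1, 3, 8)
```

Here unweighted is 5 while weighted and overall are both 4, which is why the row counts need to be carried through the first summarise.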

0

Alexr Jun 29 '16 at 8:37

akrun · Accepted Answer · 2016-06-29T08:36:14+0000

We can subset values with a logical index inside summarise:

    df %>%
      group_by(id, tp) %>%
      summarise(all_mean = mean(values),
                A_mean = mean(values[value_type == "A"]),
                value_count = sum(value_type == 'A'))
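To see how the logical subset behaves, including for a group with no "A" rows, here is a small sketch on hypothetical toy data (the toy data frame and the column g are invented for illustration):

```r
library(dplyr)

# Hypothetical toy data: group "g2" has no rows of type "A"
toy <- data.frame(
  g          = c("g1", "g1", "g1", "g2", "g2"),
  value_type = c("A", "A", "B", "B", "C"),
  values     = c(1, 3, 8, 2, 4)
)

res <- toy %>%
  group_by(g) %>%
  summarise(
    all_mean    = mean(values),                     # mean over all rows in the group
    A_mean      = mean(values[value_type == "A"]),  # mean over the "A" rows only
    value_count = sum(value_type == "A")            # number of "A" rows
  )

res
```

For g1 this gives all_mean 4, A_mean 2 and value_count 2. For g2 the subset is empty, so A_mean is NaN and value_count is 0; if that matters, check value_count before using A_mean.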
