Filtering inside dplyr's summarise function

I am struggling with dplyr because I want to do two things in one step and wonder whether that is possible.

I want to calculate the mean of all values and, at the same time, the mean of only those values that have a specific value in another column.

    library(dplyr)
    set.seed(1234)
    df <- data.frame(id = rep(1:10, each = 14),
                     tp = letters[1:14],
                     value_type = sample(LETTERS[1:3], 140, replace = TRUE),
                     values = runif(140))

    df %>%
      group_by(id, tp) %>%
      summarise(
        all_mean = mean(values),
        A_mean = mean(values),  # Only the values with value_type A
        value_count = sum(value_type == 'A')
      )

So the column A_mean should contain the mean of only those values where value_type == 'A'.

Usually I run two separate commands and merge the results afterwards, but I think there is a more convenient way that I am just missing.

Thanks in advance.

2 answers

We can try

    df %>%
      group_by(id, tp) %>%
      summarise(all_mean = mean(values),
                A_mean = mean(values[value_type == "A"]),
                value_count = sum(value_type == 'A'))
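
To see why this works, here is a minimal sketch on a small hypothetical data frame (`toy` is not part of the question; it is made up so the results can be checked by hand). Subsetting `values` with a logical condition inside `summarise()` restricts the mean to the matching rows of each group. One thing to be aware of: a group without any matching rows gives `NaN`, because `mean()` of an empty numeric vector is `NaN`.

```r
library(dplyr)

# Hypothetical toy data, chosen so the results are easy to verify by hand
toy <- data.frame(
  id         = c(1, 1, 1, 2, 2),
  value_type = c("A", "A", "B", "B", "C"),
  values     = c(1, 3, 8, 2, 4)
)

res <- toy %>%
  group_by(id) %>%
  summarise(
    all_mean    = mean(values),                     # mean over all rows of the group
    A_mean      = mean(values[value_type == "A"]),  # mean over the "A" rows only
    value_count = sum(value_type == "A")            # number of "A" rows in the group
  )

res
```

For id 1 this gives all_mean 4, A_mean 2 and value_count 2. For id 2, which has no "A" rows, A_mean is NaN and value_count is 0.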

You can do this with two summarise steps:

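    df %>%
      group_by(id, tp, value_type) %>%
      # Step 1: mean and row count per value_type within each (id, tp) group
      summarise(type_mean = mean(values), n = n()) %>%
      # Step 2: summarise() dropped the last grouping level, so this works per (id, tp)
      summarise(all_mean = sum(type_mean * n) / sum(n),  # weighted, equals mean(values)
                A_mean = sum(type_mean * (value_type == "A")),  # 0 if the group has no "A"
                value_count = sum(n[value_type == "A"]))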

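The first summarise calculates the mean for each value_type within every (id, tp) group. The second one keeps only the mean for value_type == "A": multiplying by the logical condition zeroes out the other means before they are summed.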
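
The multiply-and-sum trick can be looked at in isolation. The following is only an illustrative sketch in base R; the vectors `value_type` and `type_mean` are hypothetical stand-ins for the per-type rows of a single group.

```r
# Hypothetical per-type means for one group (illustration only)
value_type <- c("A", "B", "C")
type_mean  <- c(2, 8, 5)

# (value_type == "A") is a logical vector, which R coerces to 1/0 in arithmetic,
# so every mean except the one for "A" is multiplied by zero before summing
a_only <- sum(type_mean * (value_type == "A"))
a_only
```

Here a_only is 2. Unlike logical subsetting inside mean(), this returns 0 rather than NaN when the group has no "A" entry, so the two answers differ for such groups.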

