I need to count the rows that fall into categories I define, within groups. I know this can be done with dplyr / tidyr, but I am missing something.
Dataset example:
    Owner = c('bob','julia','cheryl','bob','julia','cheryl')
    Day = c('Mon', 'Tue')
    Locn = c('house','store','apartment','office','house','shop')
    data <- data.frame(Owner, Day, Locn)
which looks like this:

       Owner Day      Locn
    1    bob Mon     house
    2  julia Tue     store
    3 cheryl Mon apartment
    4    bob Tue    office
    5  julia Mon     house
    6 cheryl Tue      shop
I want to group by owner and day, and then count the locations by category, with one column per category. In this example, I want 'house' and 'apartment' to be counted in a "Home" column, and 'store', 'office' and 'shop' to be counted in a "Work" column.
My current code (which does not work):
    grouped_locn <- data %>%
      dplyr::arrange(Owner, Day) %>%
      dplyr::group_by(Owner, Day) %>%
      dplyr::summarize(Home = which(data$Locn %in% c('house', 'apartment')),
                       Work = which(data$Locn %in% c("store", "office", "apartment")))
I have included only my current attempt at the summarize step to show how I was approaching it. The Home and Work expressions currently return vectors of the row numbers whose Locn belongs to the category (for example, Home = 1 3 5), rather than counts.
My intended output:
       Owner Day Home Work
    1    bob Mon    1    0
    2    bob Tue    0    1
    3  julia Mon    1    0
    4  julia Tue    0    1
    5 cheryl Mon    1    0
    6 cheryl Tue    0    1
In the actual dataset (30k+ rows) there are several Locn values for each owner per day, so the Home and Work counts can be numbers other than 1 and 0 (that is, they are counts, not logical values).
Thank you very much.
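
One possible approach, offered as a minimal sketch rather than a definitive answer: inside `summarize()`, refer to the bare column `Locn` (which dplyr evaluates per group) instead of `data$Locn` (which always refers to the whole data frame), and use `sum()` on the logical `%in%` result to get a count instead of `which()`, which returns positions. The names `home_locs` and `work_locs` are illustrative, and the Work category is assumed to be 'store', 'office' and 'shop', as the intended output implies.

```r
library(dplyr)

# Example data from the question (Day is recycled to length 6 by data.frame)
Owner <- c('bob', 'julia', 'cheryl', 'bob', 'julia', 'cheryl')
Day   <- c('Mon', 'Tue')
Locn  <- c('house', 'store', 'apartment', 'office', 'house', 'shop')
data  <- data.frame(Owner, Day, Locn)

# Illustrative category definitions (assumed from the intended output)
home_locs <- c('house', 'apartment')
work_locs <- c('store', 'office', 'shop')

grouped_locn <- data %>%
  group_by(Owner, Day) %>%
  summarize(
    # Locn here is the per-group column; summing TRUE/FALSE counts the matches
    Home = sum(Locn %in% home_locs),
    Work = sum(Locn %in% work_locs),
    .groups = "drop"
  )

print(grouped_locn)
```

Because `sum()` adds up every match within a group, the counts can exceed 1 when an owner has several locations on the same day. Note that `group_by()` sorts the result by the grouping columns, so the row order may differ from the intended output shown above.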