Faster ways to calculate frequencies and cast from long to wide

I am trying to get the count of each combination of levels of two variables, "week" and "id". I would like the result to have "id" as rows, "week" as columns, and the counts as values.

An example of what I have tried so far (I also tried a bunch of other things, including adding a dummy variable = 1 and then using fun.aggregate = sum):

library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id", 
        fun.aggregate = length, fill = 0, .parallel = TRUE)

However, I must be doing something wrong, because this call never finishes. Is there a better way to do this?

Input data:

id      week
1       1
1       2
1       3
1       1
2       3

Output:

  1  2  3
1 2  1  1
2 0  0  1
4 answers

You do not need ddply for this. dcast from reshape2 is sufficient:

dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

library(reshape2)
dcast(dat, id~week, fun.aggregate=length)

  id 1 2 3
1  1 2 1 1
2  2 0 0 1

For a base R solution (other than table, which is covered in another answer), try xtabs:

xtabs(~id+week, data=dat)

   week
id  1 2 3
  1 2 1 1
  2 0 0 1

You could simply use table:

table(data$id,data$week)

    1 2 3
  1 2 1 1
  2 0 0 1

If "id" and "week" are the only columns in your data frame, you can simply use:

table(data)
#    week
# id  1 2 3
#   1 2 1 1
#   2 0 0 1
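
If you need a regular wide data frame rather than a table object, one possible sketch in base R is below. It reuses the small example data from the first answer; the names counts and wide are only illustrative, and moving id out of the row names is an optional step that assumes id is numeric.

```r
# Same example data as in the answers above.
dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

# Cross-tabulate id against week.
counts <- table(dat$id, dat$week)

# Convert the table to a wide data frame:
# row names hold the id values, column names hold the week values.
wide <- as.data.frame.matrix(counts)

# Optionally move id from the row names into a regular column
# (assumes id is numeric, as in the example data).
wide <- cbind(id = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL
wide
```

Calling as.data.frame() directly on a table would give the long format (one row per id/week pair) instead, which is why as.data.frame.matrix is used here.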

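The reason ddply takes so long is that the splitting by group is not run in parallel (only the computations on the "splits" are), so with a large number of groups it will be slow, and .parallel = T will not help.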

An approach using data.table::dcast (data.table version >= 1.9.2) should be considerably faster. By default it aggregates with length:

library(data.table) 
dcast(setDT(data), id ~ week)
# Using 'week' as value column. Use 'value.var' to override
# Aggregate function missing, defaulting to 'length'
#    id 1 2 3
# 1:  1 2 1 1
# 2:  2 0 0 1

Or, specifying the arguments explicitly:

dcast(setDT(data), id ~ week, value.var = "week", fun.aggregate = length)
#    id 1 2 3
# 1:  1 2 1 1
# 2:  2 0 0 1

For data.table versions before 1.9.2, a different approach is needed.


A tidyverse option is:

library(tidyverse)

df %>%
  count(id, week) %>%
  spread(week, n, fill = 0)

#     id   '1'   '2'   '3'
#   <dbl> <dbl> <dbl> <dbl>
#1     1     2     1     1
#2     2     0     0     1

Or, using group_by and summarise:

df %>%
  group_by(id, week) %>% #OR group_by_all()
  summarise(count = n()) %>%
  spread(week, count, fill = 0)
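
In more recent versions of tidyr, spread has been superseded by pivot_wider. A rough equivalent of the first pipeline might look like the sketch below. It assumes dplyr and a tidyr version that provides pivot_wider are installed; df here is just the example data from the question, and wide is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Example data from the question.
df <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

# Count each id/week combination, then spread the weeks into columns,
# filling combinations that never occur with 0.
wide <- df %>%
  count(id, week) %>%
  pivot_wider(names_from = week, values_from = n, values_fill = 0)

wide
```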