Create a date sequence for each group in R

I have a dataset that looks like this:

    ID        created_at
    MUM-0001  2014-04-16
    MUM-0002  2014-01-14
    MUM-0003  2014-04-17
    MUM-0004  2014-04-12
    MUM-0005  2014-04-18
    MUM-0006  2014-04-17

I am trying to add a new column containing all the dates between each start date (created_at) and a fixed end date (say, 12 July 2015). I used seq() inside a dplyr mutate() call, but I get an error.

    data1 <- data1 %>%
      arrange(ID) %>%
      group_by(ID) %>%
      mutate(date = seq(as.Date(created_at), as.Date('2015-07-12'), by = 1))

The error I am getting is:

Error: incompatible size (453) expecting 1 (group size) or 1

Can you suggest a better way to accomplish this task in R?

1 answer

You can use data.table to get the sequence of dates from 'created_at' to '2015-07-12', grouped by the column "ID".

    library(data.table)
    setDT(df1)[, list(date = seq(created_at, as.Date('2015-07-12'), by = '1 day')), ID]
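For comparison, the same expansion can be sketched in base R without any packages. This is only an illustration, not part of the original answer: the names `end_date`, `pieces`, and `expanded` are made up here, and the sample data is rebuilt with the dates written out.

```r
# Sample data from the question, with dates written out
df1 <- data.frame(
  ID = c("MUM-0001", "MUM-0002", "MUM-0003",
         "MUM-0004", "MUM-0005", "MUM-0006"),
  created_at = as.Date(c("2014-04-16", "2014-01-14", "2014-04-17",
                         "2014-04-12", "2014-04-18", "2014-04-17")),
  stringsAsFactors = FALSE
)
end_date <- as.Date("2015-07-12")  # hypothetical name for the fixed last day

# Build one small data frame per input row: the ID repeated alongside
# every day from created_at up to and including end_date
pieces <- lapply(seq_len(nrow(df1)), function(i) {
  data.frame(
    ID = df1$ID[i],
    date = seq(df1$created_at[i], end_date, by = "1 day"),
    stringsAsFactors = FALSE
  )
})

# Stack the per-row pieces into one long data frame
expanded <- do.call(rbind, pieces)
head(expanded)
```

Because it works row by row, this sketch also handles duplicate IDs, at the cost of being slower than data.table on large inputs.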

If you need an option with dplyr, use do:

    library(dplyr)
    df1 %>%
      group_by(ID) %>%
      do(data.frame(., Date = seq(.$created_at, as.Date('2015-07-12'), by = '1 day')))

If you have duplicate IDs, group by row_number() instead, so that each row gets its own sequence:

    df1 %>%
      group_by(rn = row_number()) %>%
      do(data.frame(ID = .$ID,
                    Date = seq(.$created_at, as.Date('2015-07-12'), by = '1 day'),
                    stringsAsFactors = FALSE))

Update

Based on @Frank's comment, the newer tidyverse idiom is to store each sequence in a list column and then unnest it:

    library(tidyverse)
    df1 %>%
      group_by(ID) %>%
      mutate(d = list(seq(created_at, as.Date('2015-07-12'), by = '1 day')),
             created_at = NULL) %>%
      unnest(d)
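The list() wrapper matters because of the length check reported in the question's error message: the value for a group must have length 1 or the group size. A minimal base R sketch of that mismatch, using the first row's dates (the variable names are illustrative only):

```r
start    <- as.Date("2014-04-16")   # created_at for MUM-0001
end_date <- as.Date("2015-07-12")   # the fixed last day from the question

# The raw sequence has one element per day, so it cannot fit a one-row group
dates <- seq(start, end_date, by = "1 day")
length(dates)        # 453, the size reported in the error message

# Wrapped in a list, the whole sequence counts as a single element
length(list(dates))  # 1
```

Each group then holds one list element containing its full sequence, which unnest() expands into one row per date.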

The data.table equivalent groups by row number:

    setDT(df1)[, list(date = seq(created_at, as.Date('2015-07-12'), by = '1 day')),
               by = 1:nrow(df1)]

data

    df1 <- structure(list(
      ID = c("MUM-0001", "MUM-0002", "MUM-0003",
             "MUM-0004", "MUM-0005", "MUM-0006"),
      created_at = structure(c(16176, 16084, 16177, 16172, 16178, 16177),
                             class = "Date")),
      .Names = c("ID", "created_at"),
      row.names = c(NA, -6L),
      class = "data.frame")