Identifier-based string concatenation in R

My data

Id|date1|date2   
1|2008-10-01|NA        
1|NA|2008-10-02     
1|NA|2008-10-03     
2|2008-10-02|NA
2|NA|2008-10-03

I want to reshape it like this:

Id|date1|date2|date3    
1|2008-10-01|2008-10-02|2008-10-03        
2|2008-10-02|2008-10-03 

I tried using aggregate and dcast, but they convert the dates to numeric format, and the NA values still cannot be avoided.

2 answers

You can do this quite easily using data.table, although it gets complicated if the number of missing values differs between the columns:

library(data.table)
setDT(df)[, lapply(.SD, na.omit), by = Id]
#   Id      date1       date2
# 1:  1 2008-10-02 2008-10-02 
# 2:  2 2008-10-02 2008-10-02 
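
If the counts do differ, as for Id 1 in the question (one date1 value, two date2 values), one option is to pool the non-missing dates per Id and pad the shorter groups with NA. Below is a minimal base R sketch, not a definitive solution. It assumes the date columns are stored as character (convert with as.character() first if they are Date, since binding Date vectors into a matrix loses the class). The data frame is rebuilt from the question, and the names dates_by_id, padded and result are only illustrative.

```r
# Example data rebuilt from the question; dates are kept as character
# (an assumption) so they are never coerced to numbers.
df <- data.frame(
  Id    = c(1, 1, 1, 2, 2),
  date1 = c("2008-10-01", NA, NA, "2008-10-02", NA),
  date2 = c(NA, "2008-10-02", "2008-10-03", NA, "2008-10-03"),
  stringsAsFactors = FALSE
)

# For each Id, pool the values of both columns, drop the NAs and sort.
dates_by_id <- lapply(split(df, df$Id), function(g) {
  x <- c(g$date1, g$date2)
  sort(x[!is.na(x)])
})

# Indexing past the end of a vector yields NA, which pads every
# group to the length of the longest one.
n_max  <- max(lengths(dates_by_id))
padded <- do.call(rbind, lapply(dates_by_id, function(d) d[seq_len(n_max)]))

# Assemble the wide table; Id comes back as character because it is
# taken from the names of the split list.
result <- data.frame(Id = names(dates_by_id), padded,
                     row.names = NULL, stringsAsFactors = FALSE)
names(result)[-1] <- paste0("date", seq_len(n_max))
result
```

Ids with fewer dates than the maximum get NA in the trailing columns, so Id 2 here ends up with NA in date3.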

Here's a similar idea using tidyr:

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -Id) %>% 
  na.omit() %>% 
  spread(key, value)

Which gives:

#  Id      date1      date2
#1  1 2008-10-02 2008-10-02
#2  2 2008-10-02 2008-10-02