Using two value columns in the spread() function in R

I recently posted a question on how to reshape data from a long table to a wide table, and found that spread() is a pretty convenient function for this. Now I need to take that previous question one step further.

Suppose we have a table like this:

    id1 | id2 | info  | action_time | action_comment
    1   | a   | info1 | time1       | comment1
    1   | a   | info1 | time2       | comment2
    1   | a   | info1 | time3       | comment3
    2   | b   | info2 | time4       | comment4
    2   | b   | info2 | time5       | comment5

And I would like to change it to something like this:

    id1 | id2 | info  | action_time1 | action_comment1 | action_time2 | action_comment2 | action_time3 | action_comment3
    1   | a   | info1 | time1        | comment1        | time2        | comment2        | time3        | comment3
    2   | b   | info2 | time4        | comment4        | time5        | comment5        |              |

The difference from my previous question is that there is now a second value column (action_comment), and it needs to be spread as well.

I thought of using:

    library(dplyr)
    library(tidyr)

    df %>%
      group_by(id1) %>%
      mutate(action_no = paste("action_time", row_number())) %>%
      spread(action_no, value = c(action_time, action_comment))

But this gives me an error when I pass two columns to the value argument: Invalid column specification.
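For context, spread() takes exactly one key column and one value column, so the same pipeline does run when only a single value column is supplied. The following is a minimal sketch of that single-column case, assuming the example table above is stored in df (it is rebuilt here so the block is self-contained); the name wide_time is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question, rebuilt so the sketch runs on its own
df <- data.frame(
  id1 = c(1L, 1L, 1L, 2L, 2L),
  id2 = c("a", "a", "a", "b", "b"),
  info = c("info1", "info1", "info1", "info2", "info2"),
  action_time = c("time1", "time2", "time3", "time4", "time5"),
  action_comment = c("comment1", "comment2", "comment3", "comment4", "comment5"),
  stringsAsFactors = FALSE
)

# Spread only action_time: one key column, one value column
wide_time <- df %>%
  group_by(id1) %>%
  mutate(action_no = paste0("action_time", row_number())) %>%
  ungroup() %>%
  select(-action_comment) %>%          # drop the second value column
  spread(action_no, action_time)

wide_time
```

This yields one row per id1/id2/info combination with columns action_time1 to action_time3, which is only half of the desired result; the second value column is what needs a different approach.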

I really like the idea of using the %>% operator to manage data, so I would like to know how to fix my code to make this work.

Thank you for your help.

3 answers

Try:

    library(dplyr)
    library(tidyr)

    df %>%
      group_by(id1) %>%
      mutate(id = row_number()) %>%
      gather(key, value, -(id1:info), -id) %>%
      unite(id_key, id, key) %>%
      spread(id_key, value)

Which gives:

    #Source: local data frame [2 x 9]
    #  id1 id2  info 1_action_comment 1_action_time 2_action_comment 2_action_time 3_action_comment 3_action_time
    #1   1   a info1         comment1         time1         comment2         time2         comment3         time3
    #2   2   b info2         comment4         time4         comment5         time5               NA            NA
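As a side note, in tidyr 1.0.0 and later spread() is superseded by pivot_wider(), whose values_from argument accepts several columns at once, so the gather/unite step is not needed there. A hedged sketch, assuming a recent tidyr is installed and using the question's example data (rebuilt here); action_no and wide are illustrative names, not part of the original posts.

```r
library(dplyr)
library(tidyr)

# Example data from the question, rebuilt so the sketch runs on its own
df <- data.frame(
  id1 = c(1L, 1L, 1L, 2L, 2L),
  id2 = c("a", "a", "a", "b", "b"),
  info = c("info1", "info1", "info1", "info2", "info2"),
  action_time = c("time1", "time2", "time3", "time4", "time5"),
  action_comment = c("comment1", "comment2", "comment3", "comment4", "comment5"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(id1, id2, info) %>%
  mutate(action_no = row_number()) %>%   # 1, 2, 3 within each id group
  ungroup() %>%
  pivot_wider(
    names_from  = action_no,
    values_from = c(action_time, action_comment),  # two value columns at once
    names_sep   = ""                               # action_time1, action_comment1, ...
  )

wide
```

By default the action_time columns come first, followed by the action_comment columns; missing combinations (the third action for id1 = 2) are filled with NA.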

We could do this with the devel version of data.table, whose dcast can accept multiple value.var columns. Installation instructions for the devel version: here

We convert the 'data.frame' to a 'data.table' (setDT(df)), create a sequence variable ('ind') within the grouping variables ('id1', 'id2', 'info'), and dcast from 'long' to 'wide', specifying value.var as "action_time" and "action_comment".

    library(data.table) # v1.9.5+
    setDT(df)[, ind := 1:.N, .(id1, id2, info)]
    dcast(df, id1 + id2 + info ~ ind,
          value.var = c('action_time', 'action_comment'), fill = '')
    #   id1 id2  info 1_action_time 2_action_time 3_action_time 1_action_comment
    #1:   1   a info1         time1         time2         time3         comment1
    #2:   2   b info2         time4         time5                       comment4
    #   2_action_comment 3_action_comment
    #1:         comment2         comment3
    #2:         comment5

Or use reshape from base R. We create a sequence variable ('ind') with ave and then use reshape to change the format from 'long' to 'wide'.

    df$ind <- with(df, ave(seq_along(id1), id1, id2, info, FUN=seq_along))
    reshape(df, idvar=c('id1', 'id2', 'info'), timevar='ind', direction='wide')
    #  id1 id2  info action_time.1 action_comment.1 action_time.2 action_comment.2
    #1   1   a info1         time1         comment1         time2         comment2
    #4   2   b info2         time4         comment4         time5         comment5
    #  action_time.3 action_comment.3
    #1         time3         comment3
    #4          <NA>             <NA>
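To make the ave step a little more concrete, here is a small base R sketch of the sequence variable on its own, using only the three id columns from the example data (rebuilt here so the block is self-contained). ave applies seq_along within each combination of the grouping columns, so the counter restarts at 1 for every group.

```r
# Id columns from the example data, rebuilt so the sketch runs on its own
df <- data.frame(
  id1 = c(1L, 1L, 1L, 2L, 2L),
  id2 = c("a", "a", "a", "b", "b"),
  info = c("info1", "info1", "info1", "info2", "info2"),
  stringsAsFactors = FALSE
)

# Row counter within each (id1, id2, info) group
ind <- with(df, ave(seq_along(id1), id1, id2, info, FUN = seq_along))
ind
# 1 2 3 1 2
```

This counter is what reshape then uses as timevar to decide which wide column each row lands in.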

data

    df <- structure(list(
      id1 = c(1L, 1L, 1L, 2L, 2L),
      id2 = c("a", "a", "a", "b", "b"),
      info = c("info1", "info1", "info1", "info2", "info2"),
      action_time = c("time1", "time2", "time3", "time4", "time5"),
      action_comment = c("comment1", "comment2", "comment3", "comment4", "comment5")),
      .Names = c("id1", "id2", "info", "action_time", "action_comment"),
      class = "data.frame", row.names = c(NA, -5L))

Not a direct solution, but it works:

    library(tidyr)
    a = spread(df, action_comment, action_time)
    b = spread(df, action_time, action_comment)

    # dropping NAs and shifting the values to left row wise
    a[] = t(apply(a, 1, function(x) `length<-`(na.omit(x), length(x))))
    b[] = t(apply(b, 1, function(x) `length<-`(na.omit(x), length(x))))

    out = merge(a, b, by = c('id1','id2','info'))
    out[, colSums(is.na(out)) != nrow(out)]
    #  id1 id2  info comment1 comment2 comment3    time1    time2    time3
    #1   1   a info1    time1    time2    time3 comment1 comment2 comment3
    #2   2   b info2    time4    time5     <NA> comment4 comment5     <NA>
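The row-wise shifting trick in the code above may be easier to follow on a single vector. A minimal base R sketch, with a made-up vector x standing in for one row of the spread result: na.omit drops the NA entries, and `length<-` then pads the result back to the original length with NA at the end.

```r
# Hypothetical row with gaps, standing in for one row of a or b
x <- c(NA, "time4", "time5", NA)

# Drop the NAs, then pad back to the original length with trailing NAs
shifted <- `length<-`(na.omit(x), length(x))
shifted
# "time4" "time5" NA NA
```

Note that in the full answer the shifted values keep the column names from the spread step, which is why the comment1 to comment3 columns in the output hold times and the time1 to time3 columns hold comments.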