Concatenate strings that have the same value in a variable in R

Question


I created the following data frame in R:

    V1 <- c(1,3,2,6,7,7,5,3,1,1)
    V2 <- c("rot", "grün", "grün", "gelb", "blau", "rot", "grün", "blau", "blau", "schwarz")
    V3 <- c(44,23,28,23,88,88,44,28,11,44)
    as.data.frame(cbind(V1,V2,V3))

       V1      V2 V3
    1   1     rot 44
    2   3    grün 23
    3   2    grün 28
    4   6    gelb 23
    5   7    blau 88
    6   7     rot 88
    7   5    grün 44
    8   3    blau 28
    9   1    blau 11
    10  1 schwarz 44

V3 is the variable I want to group the dataset by. The result should be a data frame with one row per distinct V3 value, where the V1 and V2 values of all rows sharing that V3 value are placed side by side in that row.

In this example, I want something like this:

    V3 V1.1 V2.1 V1.2 V2.2 V1.3    V2.3
    11    1 blau   NA   NA   NA      NA
    23    3 grün    6 gelb   NA      NA
    28    2 grün    3 blau   NA      NA
    44    1  rot    5 grün    1 schwarz
    88    7 blau    7  rot   NA      NA

Is there a function that can do this? Thanks for your help!!!!

+4

r

Sandy Aug 11 '15 at 17:39


2 answers

Here is one option, using dcast from the devel version of data.table (v1.9.5+).

Convert the data.frame to a data.table with setDT(df1). Create a sequence column, "indx", that numbers the rows within each "V3" group, then use dcast to reshape from "long" to "wide" with "indx" supplying the column suffixes. In the devel version, dcast can accept multiple value.var columns.

    library(data.table)  # v1.9.5+
    setDT(df1)[, indx := 1:.N, V3]  # create sequence variable
    dcast(df1, V3 ~ indx, value.var = c('V1', 'V2'), sep = ".")
    #    V3 V1.1 V1.2 V1.3 V2.1 V2.2    V2.3
    #1: 11    1   NA   NA blau   NA      NA
    #2: 23    3    6   NA grün gelb      NA
    #3: 28    2    3   NA grün blau      NA
    #4: 44    1    5    1  rot grün schwarz
    #5: 88    7    7   NA blau  rot      NA

NOTE. Installation instructions for the devel version: here

This can be written more compactly by using getanID from splitstackshape to create the sequence variable (it adds a ".id" column).

    library(splitstackshape)
    dcast(getanID(df1, 'V3'), V3 ~ .id, value.var = c('V1', 'V2'))
    #    V3 V1_1 V1_2 V1_3 V2_1 V2_2    V2_3
    #1: 11    1   NA   NA blau   NA      NA
    #2: 23    3    6   NA grün gelb      NA
    #3: 28    2    3   NA grün blau      NA
    #4: 44    1    5    1  rot grün schwarz
    #5: 88    7    7   NA blau  rot      NA

data

  df1 <- data.frame(V1, V2, V3)
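Note that this builds the data with data.frame() rather than the as.data.frame(cbind(...)) call from the question. A minimal check of why that matters, as a sketch only: cbind() on vectors of mixed type first produces a character matrix, so the numeric columns do not stay numeric. The name df_cbind is introduced here purely for illustration.

```r
V1 <- c(1, 3, 2, 6, 7, 7, 5, 3, 1, 1)
V2 <- c("rot", "grün", "grün", "gelb", "blau", "rot", "grün", "blau", "blau", "schwarz")
V3 <- c(44, 23, 28, 23, 88, 88, 44, 28, 11, 44)

# cbind() on mixed vectors builds a character matrix, so the numbers become text
# (df_cbind is a hypothetical name used only for this comparison)
df_cbind <- as.data.frame(cbind(V1, V2, V3))

# data.frame() keeps each vector's own type
df1 <- data.frame(V1, V2, V3)

# Compare the column classes of the two constructions
print(sapply(df_cbind, class))
print(sapply(df1, class))
```

With df1, V1 and V3 remain numeric, so they can be sorted or compared as numbers after reshaping.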

+2

akrun Aug 11 '15 at 19:16


bgoldst · Accepted Answer · 2015-08-11T17:58:25+0000

    ## df is the data frame from the question; "time" numbers the rows within each V3 group
    reshape(transform(df, time = ave(seq_len(nrow(df)), V3, FUN = seq_along)),
            direction = 'wide', idvar = 'V3')
    ##   V3 V1.1 V2.1 V1.2 V2.2 V1.3    V2.3
    ## 1 44    1  rot    5 grün    1 schwarz
    ## 2 23    3 grün    6 gelb <NA>    <NA>
    ## 3 28    2 grün    3 blau <NA>    <NA>
    ## 5 88    7 blau    7  rot <NA>    <NA>
    ## 9 11    1 blau <NA> <NA> <NA>    <NA>
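The one-liner above is dense, so here is the same base R idea unpacked into separate steps. This is a sketch under a few assumptions: the data is rebuilt with data.frame() so it runs on its own, the intermediate name wide is made up for illustration, and the final sort by V3 is an optional addition that is not part of the answer itself.

```r
# Sample data from the question, built with data.frame() so V1 and V3 stay numeric
df <- data.frame(
  V1 = c(1, 3, 2, 6, 7, 7, 5, 3, 1, 1),
  V2 = c("rot", "grün", "grün", "gelb", "blau", "rot", "grün", "blau", "blau", "schwarz"),
  V3 = c(44, 23, 28, 23, 88, 88, 44, 28, 11, 44),
  stringsAsFactors = FALSE
)

# Step 1: number the rows within each V3 group (1, 2, 3, ...)
df$time <- ave(seq_len(nrow(df)), df$V3, FUN = seq_along)

# Step 2: reshape long -> wide; one row per V3, one column pair per "time" value
wide <- reshape(df, direction = "wide", idvar = "V3", timevar = "time")

# Optional (not in the answer above): sort by V3 to match the layout in the question
wide <- wide[order(wide$V3), ]
print(wide)
```

The within-group counter plays the same role as the "indx" and ".id" columns in the data.table answer: it tells the reshape step which suffix (.1, .2, .3) each row's values should receive.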
