tidyr: using separate_rows across multiple columns

I have a data.frame where some cells contain comma-separated strings:

d <- data.frame(a=c(1:3), b=c("name1, name2, name3", "name4", "name5, name6"), c=c("name7","name8, name9", "name10" )) 

I want to split these rows so that each name ends up in its own cell. This is easy with

 tidyr::separate_rows(d, b, sep=",") 

if it is done for one column at a time. But I cannot do this for both columns "b" and "c" in a single call, as that requires the number of names in each row to be the same. Instead of writing

 tidyr::separate_rows(d, b, sep=",")
 tidyr::separate_rows(d, c, sep=",")

is there a way to do this in one line, for example using apply? Something like

 apply(d, 2, separate_rows(...)) 

I am not sure how to pass arguments to separate_rows().

r apply tidyr
2 answers

You can use a pipe. Note that sep = ", " is detected automatically.

 d %>% separate_rows(b) %>% separate_rows(c)
 #   a     b      c
 # 1 1 name1  name7
 # 2 1 name2  name7
 # 3 1 name3  name7
 # 4 2 name4  name8
 # 5 2 name4  name9
 # 6 3 name5 name10
 # 7 3 name6 name10

Note: This uses tidyr version 0.6.0, in which the %>% operator is included in the package.


Update: Following @doscendodiscimus's comment, we could use a for() loop and reassign d at each iteration. That way we can handle as many columns as we like. We will use a character vector of column names, so we need to switch to the standard-evaluation version, separate_rows_().

 cols <- c("b", "c")
 for (col in cols) {
   d <- separate_rows_(d, col)
 }

which gives the updated d

   a     b      c
 1 1 name1  name7
 2 1 name2  name7
 3 1 name3  name7
 4 2 name4  name8
 5 2 name4  name9
 6 3 name5 name10
 7 3 name6 name10
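
In later tidyr releases the underscore variants such as separate_rows_() are deprecated, so the loop above may emit a warning there. A minimal sketch of the same idea for those versions, assuming a tidyr release in which separate_rows() accepts tidyselect helpers (the use of Reduce(), tidyselect::all_of() and the explicit sep pattern are choices made for this sketch, not part of the original answer):

```r
library(tidyr)

# Same example data as in the question.
d <- data.frame(
  a = 1:3,
  b = c("name1, name2, name3", "name4", "name5, name6"),
  c = c("name7", "name8, name9", "name10"),
  stringsAsFactors = FALSE
)

# Column names as a character vector, as in the for() loop version.
cols <- c("b", "c")

# Fold over the column names: each step splits one column and passes
# the expanded data frame on to the next step.
# all_of() tells tidyselect that `col` holds a column name as a string.
result <- Reduce(
  function(acc, col) separate_rows(acc, tidyselect::all_of(col), sep = ",\\s*"),
  cols,
  init = d
)

print(result)
```

Because each column is split in its own step, rows are expanded one column at a time, so the number of names does not have to match between "b" and "c".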

Here is an alternative approach using splitstackshape::cSplit and zoo::na.locf.

 library(splitstackshape)
 library(zoo)
 df <- cSplit(d, 1:ncol(d), "long", sep = ",")
 na.locf(df[rowSums(is.na(df)) != ncol(df),])
 #    a     b      c
 # 1: 1 name1  name7
 # 2: 1 name2  name7
 # 3: 1 name3  name7
 # 4: 2 name4  name8
 # 5: 2 name4  name9
 # 6: 3 name5 name10
 # 7: 3 name6 name10