Suppose you are dealing with something like:
mydf <- data.frame(
  V1 = c("peanut butter sandwich", "peanut butter and jam sandwich"),
  V2 = c("2 slices of bread 1 tablespoon peanut butter",
         "2 slices of bread 1 tablespoon peanut butter 1 tablespoon jam"))
mydf
You could first insert a delimiter that you don't expect to occur in "V2", and then use cSplit from my "splitstackshape" package to get the data into a "long" format.
library(splitstackshape)
mydf$V2 <- gsub(" (\\d+)", "|\\1", mydf$V2)
cSplit(mydf, "V2", "|", "long")
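To see what the delimiter step does before cSplit gets involved, here is a minimal base R sketch on a single string. The names x, marked and pieces are only for illustration, and the final strsplit call merely stands in for the splitting that cSplit performs; it is not a description of how cSplit works internally.

```r
# One ingredient string, taken from the example data
x <- "2 slices of bread 1 tablespoon peanut butter"

# Replace each space that is followed by digits with "|",
# keeping the digits via the \\1 backreference
marked <- gsub(" (\\d+)", "|\\1", x)
marked

# Split on the literal "|" (fixed = TRUE stops it being read as regex alternation)
pieces <- strsplit(marked, "|", fixed = TRUE)[[1]]
pieces
```

The leading "2" is left alone because no space precedes it, so the string is only cut in front of the second quantity.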
Actually, the following aren't different enough to post as a separate answer, since they are variations on @Jota's approach, but I'm sharing them here for completeness:
strsplit inside "data.table"
Within each "V1" group, the list returned by strsplit is flattened into a single column.
library(data.table)
as.data.table(mydf)[, list(
  V2 = unlist(strsplit(as.character(V2), '\\s(?=\\d)', perl = TRUE))),
  by = V1]
"dplyr" + "tidyr"
You can use unnest from "tidyr" to expand the list column into a long form.
library(dplyr)
library(tidyr)
mydf %>%
  mutate(V2 = strsplit(as.character(V2), " (?=\\d)", perl = TRUE)) %>%
  unnest(V2)
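Both variations boil down to the same two steps: split each "V2" string at the space that precedes a digit, then repeat "V1" once per resulting piece. A rough base R sketch of that idea follows, with parts and long as illustrative names only; it is not meant to show what "data.table" or "tidyr" do internally.

```r
mydf <- data.frame(
  V1 = c("peanut butter sandwich", "peanut butter and jam sandwich"),
  V2 = c("2 slices of bread 1 tablespoon peanut butter",
         "2 slices of bread 1 tablespoon peanut butter 1 tablespoon jam"),
  stringsAsFactors = FALSE)

# Split at each space followed by a digit; the lookahead (?=\\d)
# leaves the digit in the next piece and requires perl = TRUE
parts <- strsplit(mydf$V2, " (?=\\d)", perl = TRUE)

# Repeat each V1 once per piece and flatten the list into one column
long <- data.frame(
  V1 = rep(mydf$V1, lengths(parts)),
  V2 = unlist(parts),
  stringsAsFactors = FALSE)
long
```

On this data, long has five rows: two for the first sandwich and three for the second.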