Equivalent of cumsum for strings in R

I am looking for the equivalent of a cumulative sum in R, but for strings (character values) instead of numbers: each row should contain the concatenation of all text values up to and including that row.

E.g., in the data frame "df", column A contains the input and column B the desired result:

      A       B
    1 banana  banana
    2 boats   banana boats
    3 are     banana boats are
    4 awesome banana boats are awesome

I currently solve this with the following loop:

    df$B <- ""
    for (i in 1:nrow(df)) {
      if (length(df[i - 1, "A"]) > 0) {
        df$B[i] <- paste(df$B[i - 1], df$A[i])
      } else {
        df$B[i] <- df$A[i]
      }
    }

Is there a more elegant or faster solution?

3 answers
    (df$B <- Reduce(paste, as.character(df$A), accumulate = TRUE))
    # [1] "banana" "banana boats" "banana boats are" "banana boats are awesome"
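For readers unfamiliar with `Reduce`, here is a minimal self-contained sketch of why the call above works: with `accumulate = TRUE`, the left fold keeps every intermediate result instead of only the last one. The data frame is rebuilt from the question's example, and `paste_dash` is a hypothetical helper added only to illustrate a different separator.

```r
# Rebuild the example data from the question; keep A as character, not factor.
df <- data.frame(A = c("banana", "boats", "are", "awesome"),
                 stringsAsFactors = FALSE)

# accumulate = TRUE returns each step of the fold:
# A[1], paste(A[1], A[2]), paste(<previous result>, A[3]), ...
df$B <- Reduce(paste, df$A, accumulate = TRUE)

# Hypothetical variant: any two-argument function works, e.g. another separator.
paste_dash <- function(x, y) paste(x, y, sep = "-")
dashed <- Reduce(paste_dash, df$A, accumulate = TRUE)

print(df)
print(dashed)
```

Because every intermediate result has length one, `Reduce` simplifies the output to a character vector, which is why it can be assigned directly to a data frame column.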

I don't know if this is faster, but at least the code is shorter:

    sapply(seq_along(df$A), function(x) paste(df$A[1:x], collapse = " "))

Thanks to Roland's remark, I realized that this is one of the rare cases where a for loop can be useful, since it saves us from re-indexing. It differs from the OP's version in that it starts at 2, which removes the need for the if statement inside the loop.

    # Preallocate a character vector with one slot per row
    res <- rep(NA_character_, length(df1$A))
    res[1] <- as.character(df1$A[1])
    for (i in 2:length(df1$A)) {
      res[i] <- paste(res[i - 1], df1$A[i])
    }
    res
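As a hedged sketch, the loop can be wrapped in a small reusable function. The name `cumpaste` and its `sep` argument are hypothetical, not part of base R; the version below also guards against empty and length-one input, which `2:length(x)` would mishandle.

```r
# Hypothetical helper: cumulative paste using a preallocated result vector.
cumpaste <- function(x, sep = " ") {
  x <- as.character(x)
  n <- length(x)
  if (n == 0) return(character(0))
  res <- character(n)
  res[1] <- x[1]
  # seq_len(n - 1) + 1 is empty when n == 1, so the loop body is skipped.
  for (i in seq_len(n - 1) + 1) {
    res[i] <- paste(res[i - 1], x[i], sep = sep)
  }
  res
}

words <- c("banana", "boats", "are", "awesome")
result <- cumpaste(words)
print(result)
```

Each iteration reuses the previous result, so every element is pasted only once rather than re-collapsing the whole prefix as the `sapply` version does.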

We can try

    i1 <- sequence(seq_len(nrow(df1)))
    tapply(df1$A[i1], cumsum(c(TRUE, diff(i1) <= 0)), FUN = paste, collapse = ' ')

or

    i1 <- rep(seq(nrow(df1)), seq(nrow(df1)))
    tapply(i1, i1, FUN = function(x) paste(df1$A[seq_along(x)], collapse = ' '))
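To make the index trick in the first variant easier to follow, here is a small sketch that prints the intermediate vectors. The data frame `df1` is rebuilt here from the question's example (an assumption, since the answer does not define it), and the variable names `grp` and `out` are introduced only for illustration.

```r
# Assumed example data, matching the question.
df1 <- data.frame(A = c("banana", "boats", "are", "awesome"),
                  stringsAsFactors = FALSE)

# Row indices for every prefix: 1 | 1 2 | 1 2 3 | 1 2 3 4
i1 <- sequence(seq_len(nrow(df1)))

# A new group starts wherever the index does not increase.
grp <- cumsum(c(TRUE, diff(i1) <= 0))

# Paste the words of each prefix together.
out <- tapply(df1$A[i1], grp, FUN = paste, collapse = " ")

print(i1)
print(grp)
print(as.vector(out))
```

Note that this approach materializes every prefix, so the index vector has n(n+1)/2 elements for n rows.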