do.call("rbind", list(data, frames)), but also index each row by the original data frame

    df1 <- data.frame(a = 1:2, b = 3:4)
    df2 <- data.frame(a = 5:6, b = 7:8)

    # A common method loses the origin of each row.
    do.call("rbind", list(df1, df2))
    ##   a b
    ## 1 1 3
    ## 2 2 4
    ## 3 5 7
    ## 4 6 8

    # Whereas here, X1 records which data frame each row originated in.
    library(plyr)
    adply(list(df1, df2), 1)
    ##   X1 a b
    ## 1  1 1 3
    ## 2  1 2 4
    ## 3  2 5 7
    ## 4  2 6 8

Are there other ways to do this, perhaps more efficiently?

2 answers

Here is one way.

    library(dplyr)
    library(tidyr)

    foo <- list(df1, df2)

    unnest(foo, names) %>%
      mutate(names = gsub("^X", "", names))
    #   names a b
    # 1     1 1 3
    # 2     1 2 4
    # 3     2 5 7
    # 4     2 6 8
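A related sketch, in case the `unnest()` call above does not accept a bare list in your version of tidyr (that is an assumption about newer releases, not something stated in this answer): `dplyr::bind_rows()` has an `.id` argument that records which list element each row came from. The column name `names` is kept only to match the output above, and `res` is a name made up for this example.

```r
library(dplyr)

df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
foo <- list(df1, df2)

# .id adds a column identifying the list element each row came from.
# For an unnamed list the identifier is the element's position.
res <- bind_rows(foo, .id = "names")

# Convert the identifier to integer so it can be used as an index.
res$names <- as.integer(res$names)
res
```

If the list is named, e.g. `list(first = df1, second = df2)`, the identifiers are taken from the names instead of positions, so the `as.integer()` step would have to be dropped.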

With base:

    df1 <- data.frame(a = 1:2, b = 3:4)
    df2 <- data.frame(a = 5:6, b = 7:8)
    frames <- list(df1, df2)

    do.call(rbind, lapply(seq_along(frames), function(x) {
      frames[[x]]$X1 <- x
      frames[[x]]
    }))
    ##   a b X1
    ## 1 1 3  1
    ## 2 2 4  1
    ## 3 5 7  2
    ## 4 6 8  2
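A variation on the same base idea, offered as a sketch only: bind the frames first, then build the index column in one step by repeating each list position as many times as that frame has rows. The name `combined` is made up for this example, and I have not benchmarked it against the `lapply()` version, so treat any speed difference as an open question.

```r
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
frames <- list(df1, df2)

# Bind all frames without touching them individually.
combined <- do.call(rbind, frames)

# Repeat each list position once per row of the corresponding frame.
combined$X1 <- rep(seq_along(frames), times = vapply(frames, nrow, integer(1)))
combined
```

This relies on `rbind` keeping the rows in list order, which it does when the frames share the same columns as they do here.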

As an aside, if you want to see how plyr does it, have a gander at plyr::adply, plyr:::splitter_a and plyr::ldply. These answers are trivial compared to that :-)


Source: https://habr.com/ru/post/1211126/

