I tried an example. For what it's worth, it agrees with the user's statement that inserting rows into a data frame is also very slow. I don't quite understand what is happening, since I would have expected the allocation problem to outweigh the cost of copying. Can someone either replicate this, explain why the results below (rbind < appending < insertion, in elapsed time) would hold in general, or explain why this is not a representative example (for example, the data frame is too small)?
Edit: the first time around I forgot to initialize the object in hell2fun as a data frame, so the code was performing matrix operations, which are much faster, rather than data frame operations. If I get a chance, I will extend the comparison to data frame vs. matrix. The qualitative statements in the first paragraph still hold, though.
    N <- 1000
    set.seed(101)
    r <- matrix(runif(2*N), ncol=2)

    ## second circle of hell
    hell2fun <- function() {
        df <- as.data.frame(rbind(r[1,]))  ## initialize
        for (i in 2:N) {
            df <- rbind(df, r[i,])
        }
    }

    insertfun <- function() {
        df <- data.frame(x=rep(NA,N), y=rep(NA,N))
        for (i in 1:N) {
            df[i,] <- r[i,]
        }
    }

    rsplit <- as.list(as.data.frame(t(r)))
    rbindfun <- function() {
        do.call(rbind, rsplit)
    }

    library(rbenchmark)
    benchmark(hell2fun(), insertfun(), rbindfun())

    ##          test replications elapsed relative user.self
    ## 1  hell2fun()          100  32.439  484.164    31.778
    ## 2 insertfun()          100  45.486  678.896    42.978
    ## 3  rbindfun()          100   0.067    1.000     0.076
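For the data frame vs. matrix comparison mentioned in the edit, here is a minimal sketch of how it could be set up. This is an assumption about how one might run it, not something taken from the benchmark above: the function names `insert_matrix` and `insert_df` are made up for illustration, and it uses base R's `system.time` for a single run instead of rbenchmark, so the timings will be noisy and depend on the machine.

```r
## Sketch only: row-by-row insertion into a preallocated matrix
## versus a preallocated data frame. Names here are hypothetical.
N <- 1000
set.seed(101)
r <- matrix(runif(2 * N), ncol = 2)

## Fill a preallocated numeric matrix one row at a time.
insert_matrix <- function() {
  m <- matrix(NA_real_, nrow = N, ncol = 2)
  for (i in seq_len(N)) {
    m[i, ] <- r[i, ]
  }
  m
}

## Fill a preallocated data frame one row at a time.
insert_df <- function() {
  df <- data.frame(x = rep(NA_real_, N), y = rep(NA_real_, N))
  for (i in seq_len(N)) {
    df[i, ] <- r[i, ]
  }
  df
}

## Time one run of each; elapsed seconds only.
t_matrix <- system.time(m <- insert_matrix())
t_df <- system.time(d <- insert_df())
print(c(matrix = t_matrix[["elapsed"]], data_frame = t_df[["elapsed"]]))

## Sanity check: both containers should end up with the same values.
stopifnot(isTRUE(all.equal(unname(as.matrix(d)), m)))
```

Both functions return the filled object so the results can be checked against each other; for timings worth reporting, wrapping the two calls in `benchmark()` with replications, as in the code above, would be the more reliable choice.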