Efficient column multiplication in a data frame

I have a large data frame in which I multiply two columns together to get another column. First, I ran a for loop, for example:

for (i in 1:nrow(df)) {
  df$new_column[i] <- df$column1[i] * df$column2[i]
}

but it takes 9 days.

Another alternative was plyr, though I may be using the variables incorrectly:

  new_df <- ddply(df, .(column1,column2), transform, new_column = column1 * column2) # but this is taking forever 
+7
3 answers

As Blue Master said in the comments,

 df$new_column <- df$column1 * df$column2 

should work fine. Of course, we cannot know for sure without sample data.
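To make the comparison concrete, here is a minimal sketch on made-up data. The data frame `df` below, its size, and its values are assumptions for illustration, not the asker's data. It checks that the vectorized assignment gives the same result as the element-wise loop from the question.

```r
# Hypothetical sample data; the real df is not shown in the question.
set.seed(1)
n <- 1000
df <- data.frame(column1 = runif(n), column2 = runif(n))

# Vectorized: one expression multiplies the two columns element by element.
df$new_column <- df$column1 * df$column2

# Loop version from the question, written to a separate vector for comparison.
loop_result <- numeric(n)
for (i in seq_len(n)) {
  loop_result[i] <- df$column1[i] * df$column2[i]
}

# Both approaches should agree exactly.
stopifnot(identical(df$new_column, loop_result))
```

On a large data frame, wrapping each version in system.time() should show the difference in run time; no timings are claimed here.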

+19

A minor, slightly less efficient variation on Sacha's answer is to use transform() or within():

 df <- transform(df, new = column1 * column2) 

or

 df <- within(df, new <- column1 * column2) 

(I hate spattering my user code with $.)
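As a quick check, the sketch below applies both forms to a small made-up data frame (the values are assumptions for illustration) and confirms that they produce the same column as the `$` version.

```r
# Hypothetical sample data for illustration only.
df <- data.frame(column1 = c(1, 2, 3), column2 = c(4, 5, 6))

# transform() evaluates the expression with the columns of df in scope.
df_t <- transform(df, new = column1 * column2)

# within() does the same, using an assignment inside the expression.
df_w <- within(df, new <- column1 * column2)

# Reference result using `$`.
expected <- df$column1 * df$column2

stopifnot(identical(df_t$new, expected))
stopifnot(identical(df_w$new, expected))
```

Both functions return a modified copy, so the result has to be assigned back to df, as in the two lines above.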

+10

A data.table solution will avoid a lot of internal copying, with the added advantage of not spattering the code with $.

  library(data.table)
  DT <- data.table(df)
  DT[ , new := column1 * column2]
+10