I have a large data frame in which I multiply two columns together to get another column. First, I ran a for loop, for example:
for(i in 1:nrow(df)){ df$new_column[i] <- df$column1[i] * df$column2[i] }
but it takes 9 days.
Another alternative was plyr, though I may be using the variables incorrectly:
new_df <- ddply(df, .(column1,column2), transform, new_column = column1 * column2) # but this is taking forever
As the Blue Master said in the comments,
df$new_column <- df$column1 * df$column2
should work fine. Of course, we cannot know for sure without sample data.
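For what it's worth, here is a minimal sketch on made-up data (the size, the seed, and the random values are assumptions, only the column names come from the question). It checks that the vectorized line gives the same numbers as the element-by-element loop on a small subset:

```r
# Hypothetical sample data; only the column names follow the question.
set.seed(1)
n <- 1e5
df <- data.frame(column1 = runif(n), column2 = runif(n))

# Vectorized: a single call multiplies the two columns element by element.
df$new_column <- df$column1 * df$column2

# Loop version on a small subset, only to confirm the results agree.
small <- df[1:1000, c("column1", "column2")]
small$new_column <- NA_real_
for (i in seq_len(nrow(small))) {
  small$new_column[i] <- small$column1[i] * small$column2[i]
}
stopifnot(isTRUE(all.equal(small$new_column, df$new_column[1:1000])))
```

The loop is slow mainly because each assignment to a single element of a data frame column is a separate, relatively expensive operation, whereas the vectorized form does the whole column at once.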
A minor, somewhat less efficient variation on Sacha's answer is to use transform() or within():
df <- transform(df, new = column1 * column2)
or
df <- within(df, new <- column1 * column2)
(I hate spattering my user code with $.)
A data.table solution avoids a lot of internal copying and also has the advantage of not spattering the code with $.
library(data.table)
DT <- data.table(df)
DT[ , new := column1 * column2]
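As a rough, self-contained sketch of the same idea (the sample data here is made up, and it assumes the data.table package is installed), note that := adds the column to DT in place, so there is no need to assign the result back:

```r
library(data.table)

# Hypothetical sample data; only the column names follow the question.
set.seed(1)
df <- data.frame(column1 = runif(1000), column2 = runif(1000))

# data.table(df) makes one copy of df up front.
DT <- data.table(df)

# := adds the new column by reference; DT is modified without reassignment.
DT[, new := column1 * column2]

# Columns are referred to by bare name inside [ ], so no $ is needed there.
head(DT)
```

If even the initial copy matters, setDT(df) converts the existing data frame to a data.table by reference instead of copying it.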