A faster way to multiply in a data frame

I have a data frame (named t) like this:

 ID N com_a com_b com_c
 A  3     1     0     0
 A  5     0     1     0
 B  1     1     0     0
 B  1     0     1     0
 B  4     0     0     1
 B  4     1     0     0

I'm trying to add the columns com_a*N, com_b*N and com_c*N:

 ID N com_a com_b com_c com_a_N com_b_N com_c_N
 A  3     1     0     0       3       0       0
 A  5     0     1     0       0       5       0
 B  1     1     0     0       1       0       0
 B  1     0     1     0       0       1       0
 B  4     0     0     1       0       0       4
 B  4     1     0     0       4       0       0

I currently use a for loop, but it takes a long time. How can I do this quickly on a large data set?

 for (i in 1:dim(t)[1]) {
   t$com_a_N[i] = t$com_a[i] * t$N[i]
   t$com_b_N[i] = t$com_b[i] * t$N[i]
   t$com_c_N[i] = t$com_c[i] * t$N[i]
 }
5 answers
 t <- transform(t, com_a_N=com_a*N, com_b_N=com_b*N, com_c_N=com_c*N) 

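This should be much faster, because each multiplication is applied to a whole column at once instead of one row at a time. A data.table solution may be faster still.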
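
To see the difference on your own machine, here is a minimal timing sketch. The data frame big, its size, and the helper names loop_version and vec_version are made up for illustration; only one com_ column is used to keep it short, and the timings will vary between machines.

```r
# Hypothetical test data: 10,000 rows, one indicator column
set.seed(1)
n <- 1e4
big <- data.frame(N = sample(1:5, n, replace = TRUE),
                  com_a = rbinom(n, 1, 0.5))

# Row-by-row loop, as in the question
loop_version <- function(d) {
  d$com_a_N <- NA_real_
  for (i in seq_len(nrow(d))) {
    d$com_a_N[i] <- d$com_a[i] * d$N[i]
  }
  d
}

# Vectorized version: one multiplication over the whole column
vec_version <- function(d) transform(d, com_a_N = com_a * N)

system.time(res_loop <- loop_version(big))
system.time(res_vec <- vec_version(big))

# Both approaches should produce the same values
all.equal(res_loop$com_a_N, res_vec$com_a_N)
```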


You can use sweep for this:

 (st <- sweep(t[, 3:5], 1, t$N, "*"))
 #   com_a com_b com_c
 # 1     3     0     0
 # 2     0     5     0
 # 3     1     0     0
 # 4     0     1     0
 # 5     0     0     4
 # 6     4     0     0

New names can be created using paste and setNames, and you can add the new columns to the existing data.frame using cbind. This will scale to any number of columns.

 cbind(t, setNames(st, paste(names(st), "N", sep="_")))
 #   ID N com_a com_b com_c com_a_N com_b_N com_c_N
 # 1  A 3     1     0     0       3       0       0
 # 2  A 5     0     1     0       0       5       0
 # 3  B 1     1     0     0       1       0       0
 # 4  B 1     0     1     0       0       1       0
 # 5  B 4     0     0     1       0       0       4
 # 6  B 4     1     0     0       4       0       0

A data.table solution, as proposed by @BenBolker:

 library(data.table)
 setDT(t)[, c("com_a_N", "com_b_N", "com_c_N") := list(com_a*N, com_b*N, com_c*N)]
 ##    ID N com_a com_b com_c com_a_N com_b_N com_c_N
 ## 1:  A 3     1     0     0       3       0       0
 ## 2:  A 5     0     1     0       0       5       0
 ## 3:  B 1     1     0     0       1       0       0
 ## 4:  B 1     0     1     0       0       1       0
 ## 5:  B 4     0     0     1       0       0       4
 ## 6:  B 4     1     0     0       4       0       0
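
If there are many com_ columns, typing each product by hand gets tedious. The following is a sketch of one way to generalize the data.table approach with .SDcols; the table dt, the "^com_" pattern, and the helper vectors cols and new_cols are assumptions for illustration, not part of the original answer.

```r
library(data.table)

# Example data matching the question (built here so the block is self-contained)
dt <- data.table(
  ID    = c("A", "A", "B", "B", "B", "B"),
  N     = c(3, 5, 1, 1, 4, 4),
  com_a = c(1, 0, 1, 0, 0, 1),
  com_b = c(0, 1, 0, 1, 0, 0),
  com_c = c(0, 0, 0, 0, 1, 0)
)

# Assumption: every column to multiply starts with "com_"
cols     <- grep("^com_", names(dt), value = TRUE)
new_cols <- paste0(cols, "_N")

# Multiply each selected column by N and add the results by reference
dt[, (new_cols) := lapply(.SD, function(x) x * N), .SDcols = cols]

dt
```

Because := modifies dt by reference, no copy of the table is made, which is usually what matters for large data.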

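Even faster, multiply the columns directly. This is element-wise multiplication with N recycled down each column, not true matrix multiplication (the data frame is called dat here):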

 cbind(dat,dat[,3:5]*dat$N) 

Note that the new columns keep the original names (com_a, com_b, com_c), so you should rename them afterwards.

Hard-coded column indices are not recommended. To avoid them, you can select the columns by name with grep:

 cbind(dat,dat[,grep('com',colnames(dat))]*dat$N) 
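
The renaming step can be folded into the assignment. The sketch below is one way to do it in base R; it assumes that all columns to multiply start with "com_", and the vector name cols is introduced here only for illustration.

```r
# Example data matching the question
dat <- data.frame(
  ID    = c("A", "A", "B", "B", "B", "B"),
  N     = c(3, 5, 1, 1, 4, 4),
  com_a = c(1, 0, 1, 0, 0, 1),
  com_b = c(0, 1, 0, 1, 0, 0),
  com_c = c(0, 0, 0, 0, 1, 0)
)

# Select the columns by name instead of by position
cols <- grep("^com_", names(dat), value = TRUE)

# Multiply every selected column by N and store the results
# under new names with an "_N" suffix
dat[paste0(cols, "_N")] <- dat[cols] * dat$N

dat
```

Anchoring the pattern with ^ keeps it from matching the new com_a_N style columns only by accident of position; it still matches them by prefix, so run the grep before adding the new columns, as above.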

Another option with dplyr:

 require(dplyr)
 t <- mutate(t, com_a_N=com_a*N, com_b_N=com_b*N, com_c_N=com_c*N)