I have come across this before. We should not reach for ifelse() all the time. If you look at how ifelse() is written, by typing ifelse (without parentheses) at the R console, you will see that the function is implemented in plain R and performs a number of checks before doing any real work, which is really inefficient.
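For example, you can inspect it right in the console (both of these are plain base R):

ifelse        # typing the name alone prints the R source of the function
body(ifelse)  # or look at just the body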
Instead of using ifelse() we can do this:
getScore <- function(history, similarities) {
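  ## a sketch of the vectorised replacement: direct logical subsetting
  ## instead of ifelse(); this step needs only "<" and "-", which matches
  ## the profile below
  ind <- similarities < 0
  history[ind] <- 6 - history[ind]
  ## the rest of the computation is the same as in mat_getScore() further down
  x <- history * abs(similarities)
  contados <- !is.na(history)
  sum(x, na.rm = TRUE) / sum(abs(similarities[contados]), na.rm = TRUE)
}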
And then check the profiling result again:
Rprof("foo.out") for (i in (1:10)) getScore(history, similarities) Rprof(NULL) summaryRprof("foo.out") # $by.total # total.time total.pct self.time self.pct # "getScore" 2.10 100.00 0.88 41.90 # "abs" 0.32 15.24 0.32 15.24 # "*" 0.26 12.38 0.26 12.38 # "sum" 0.26 12.38 0.26 12.38 # "<" 0.14 6.67 0.14 6.67 # "-" 0.14 6.67 0.14 6.67 # "!" 0.06 2.86 0.06 2.86 # "is.na" 0.04 1.90 0.04 1.90 # $sample.interval # [1] 0.02 # $sampling.time # [1] 2.1
That is a better-than-2x speed-up in performance. In addition, the profile is now close to flat, with no single operation dominating the runtime.
In R, vector indexing / reading / writing runs at the speed of C code, so whenever we can, we should work with whole vectors.
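As a rough illustration (the toy vectors h and s below are made up for this example, not the OP's data), compare ifelse() with plain logical subsetting:

n <- 1e7
h <- sample(1:5, n, replace = TRUE)  # toy "history"
s <- runif(n, -1, 1)                 # toy "similarities"

# ifelse() version
system.time(r1 <- ifelse(s < 0, 6 - h, h))

# logical-subsetting version
system.time({
  ind <- s < 0
  r2 <- h
  r2[ind] <- 6 - h[ind]
})

identical(r1, r2)  # both approaches give the same result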
Testing @Matveyevsky's answer
mat_getScore <- function(history, similarities) {
  ######## old code #######
  # nh <- ifelse(similarities < 0, 6 - history, history)
  ######## old code #######
  ######## new code #######
  ind <- similarities < 0
  nh <- ind*(6-history) + (!ind)*history
  ######## new code #######
  x <- nh * abs(similarities)
  contados <- !is.na(history)
  sum(x, na.rm=TRUE) / sum(abs(similarities[contados]), na.rm = TRUE)
}

Rprof("foo.out")
for (i in (1:10)) mat_getScore(history, similarities)
Rprof(NULL)
summaryRprof("foo.out")
# $by.total
#                total.time total.pct self.time self.pct
# "mat_getScore"       2.60    100.00      0.24     9.23
# "*"                  0.76     29.23      0.76    29.23
# "!"                  0.40     15.38      0.40    15.38
# "-"                  0.34     13.08      0.34    13.08
# "+"                  0.26     10.00      0.26    10.00
# "abs"                0.20      7.69      0.20     7.69
# "sum"                0.18      6.92      0.18     6.92
# "<"                  0.16      6.15      0.16     6.15
# "is.na"              0.06      2.31      0.06     2.31
# $sample.interval
# [1] 0.02
# $sampling.time
# [1] 2.6
Huh? Slower?
The full profiling result shows that this approach spends more time on floating-point multiplication ("*"), and the logical negation ("!") is not cheap either, whereas my approach only requires floating-point addition / subtraction.
Well, the result may also be architecture dependent. I am currently testing on an Intel Nehalem (Intel Core 2 Duo), so benchmarks of the two approaches on other platforms are welcome.
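If anyone wants to run such a comparison, here is a sketch (it assumes the microbenchmark package is installed and that history and similarities hold the OP's data):

library(microbenchmark)
microbenchmark(
  indexing   = getScore(history, similarities),      # logical-subsetting version
  arithmetic = mat_getScore(history, similarities),  # arithmetic-mask version
  times = 100
)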
Note
All profiling above uses the OP's data from the question.