Are the built-in functions in R usually optimized?

I have written some code to calculate the correlation coefficient in R. However, I just found out that the "boot" package offers a corr() function that does the same job. Are the built-in functions in R usually more efficient and faster than the equivalents we write from scratch?

Thanks.

+5
4 answers

I don't think there is one specific answer to this question, as it will vary greatly depending on the specific function you are asking about. Some functions in add-on packages are provided as a convenience and are just wrappers around base functions. Others are added to extend base functionality or to address some other perceived deficit in the base functions. Some, as you suggest, are added to improve computing time or efficiency. And others are added because the package authors feel that the solutions in base R are flawed in some way.

In the case of stats:::cor and boot:::corr, it seems that the latter adds the ability to apply weights. It does not appear to be faster:

> library(boot)  # provides corr()
> dat <- matrix(rnorm(1e6), ncol = 2)
> system.time(
+ cor(dat[, 1],dat[, 2])
+ )
   user  system elapsed 
   0.01    0.00    0.02 
> system.time(
+ corr(dat)
+ )
   user  system elapsed 
   0.11    0.00    0.11 
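
To address the original question more directly, here is a minimal sketch that compares a hand-written Pearson correlation with the built-in cor(). The function name my_cor is hypothetical and stands in for whatever the asker wrote from scratch. Timings vary by machine, so none are shown.

```r
# Hypothetical hand-written Pearson correlation (not from the original post).
my_cor <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

set.seed(123)
dat <- matrix(rnorm(1e6), ncol = 2)

r_builtin <- cor(dat[, 1], dat[, 2])
r_manual  <- my_cor(dat[, 1], dat[, 2])

# The two estimates should agree up to floating-point error.
print(all.equal(r_builtin, r_manual))

# Compare run times on your own machine.
print(system.time(cor(dat[, 1], dat[, 2])))
print(system.time(my_cor(dat[, 1], dat[, 2])))
```

Checking that both versions return the same value before timing them keeps the comparison fair.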
+5

- (.. ) , R C (++) Fortran- - .Internal, .External, .C, .Fortran .Call , , , . , , R .

, , -. , 1 10 , , , , 90% .

+3

Chase, , , . . . , .
, , OP cor R, . ?cor.

: , , , rowSums apply sum. , , ( , ), . , ., , .
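
The rowSums and apply/sum pair named above can be compared with a small sketch like the following. The matrix size and variable names are arbitrary choices for illustration, and the timings will differ from machine to machine.

```r
set.seed(42)
m <- matrix(rnorm(2e5), nrow = 2000)  # 2000 rows, 100 columns

rs_builtin <- rowSums(m)        # dedicated built-in for row sums
rs_apply   <- apply(m, 1, sum)  # general-purpose apply() calling sum() per row

# Both approaches should give the same row sums up to floating-point error.
print(all.equal(rs_builtin, rs_apply))

# Repeat each call a few times so the difference is measurable.
print(system.time(for (i in 1:50) rowSums(m)))
print(system.time(for (i in 1:50) apply(m, 1, sum)))
```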

, , , R: R - , , , .

, , , , , , ( C Fortran). , ( Hadley Wickham plyr apply).

+2

, R "", , .. , lm glm lm.fit glm.fit . cor .Internal(cor(x, y, na.method, FALSE)) . (1) (2) , , :

library(rbenchmark)
x <- y <- runif(1000)
benchmark(cor(x,y),.Internal(cor(x,y,4,FALSE)),replications=10000)
                            test replications elapsed relative user.self
1                      cor(x, y)        10000   1.131 5.004425     1.136
2 .Internal(cor(x, y, 4, FALSE))        10000   0.226 1.000000     0.224

- : , , ( , , )...

x <- y <- rnorm(5e5)
benchmark(cor(x,y),.Internal(cor(x,y,4,FALSE)),replications=500)
                            test replications elapsed relative user.self
1                      cor(x, y)          500   5.402 1.013889     5.384
2 .Internal(cor(x, y, 4, FALSE))          500   5.328 1.000000     5.316
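
The same wrapper-versus-workhorse comparison can be sketched for the lm and lm.fit pair mentioned above, using only documented functions. The simulated data and variable names here are assumptions for illustration. lm() builds a model frame and design matrix from a formula before fitting, whereas lm.fit() expects the design matrix to be supplied directly.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# High-level interface: parses the formula and builds the design matrix.
fit_lm <- lm(y ~ x)

# Lower-level interface: the caller supplies the design matrix.
X <- cbind("(Intercept)" = 1, x = x)
fit_raw <- lm.fit(X, y)

coef_lm  <- unname(coef(fit_lm))
coef_raw <- unname(fit_raw$coefficients)

# The estimated coefficients should match.
print(all.equal(coef_lm, coef_raw))

# On small inputs the extra work in lm() is a larger share of the run time.
print(system.time(for (i in 1:200) lm(y ~ x)))
print(system.time(for (i in 1:200) lm.fit(X, y)))
```

Unlike the .Internal call in the benchmark, lm.fit() is a documented function, so it is the safer of the two to call directly in your own code.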
+2