ns varies for no apparent reason

My results from using splines::ns in a least-squares fit changed with no rhyme or reason that I could see, and I think I have traced the problem to the ns function itself.

I have reduced the problem to this:

    require(splines)
    N <- 0
    set.seed(1)
    for (i in 1:100) N <- N + identical(ns(1:10,3),ns(1:10,3))
    N

My results average around 39, with a range of 34 to 44 or so, but I expected 100 every time. Why should the results of ns be random? If I substitute bs for ns in both places, I get 100, as expected. I call set.seed(1) in the hope of demonstrating that the randomness I get is not what R intended.

In a clean session, using RStudio and R version 2.14.2 (2012-02-29), I get 39, 44, 38, etc. Everyone else seems to get 100.

Additional Information:

Substituting splines::ns for ns gives the same results. A pure vanilla session gives the same results. My computer has 8 cores.

The differences, when they occur, are usually or always 2^-54:

    Max <- 0
    for (i in 1:1000) Max <- max( Max, abs(ns(1:10,3)-ns(1:10,3)) )
    c(Max,2^-54)

with the result [1] 5.551115e-17 5.551115e-17. This variability causes me big problems down the line, because my optimize(...)$min now sometimes varies even in the first digit, making my results non-repeatable.
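Because the discrepancies are on the order of 2^-54, one way to check whether two bases agree for practical purposes is a tolerance-based comparison instead of identical(). This is a minimal sketch: scaling by (1 + 2^-52) is an assumption made purely for illustration, standing in for the last-bit differences described above.

```r
library(splines)

a <- ns(1:10, 3)

# Assumption for illustration: mimic a last-bit discrepancy by scaling every
# entry by (1 + 2^-52), which shifts each nonzero value by about one ulp.
b <- a * (1 + 2^-52)

bitwise_same   <- identical(a, b)          # exact, bit-for-bit comparison
tolerance_same <- isTRUE(all.equal(a, b))  # default tolerance is about 1.5e-8
largest_gap    <- max(abs(a - b))          # size of the largest discrepancy
```

Here identical() reports a mismatch while all.equal() does not. Counting with isTRUE(all.equal(...)) in the loop above would therefore ignore last-bit noise, although that alone would not make a downstream optimize() call repeatable.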

My sessionInfo with a clean vanilla session:

I created what I understand is called a pure vanilla session using

    > .Last <- function() system("R --vanilla")
    > q("no")

This resets the session, and when I restart it, I get my pure vanilla session. Then, in response to a question from Ben Bolker, I did this at the beginning of my clean vanilla session:

    > sessionInfo()
    R version 2.14.2 (2012-02-29)
    Platform: x86_64-pc-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] Revobase_6.1.0 RevoMods_6.1.0 RevoScaleR_3.1-0 lattice_0.20-0
    [5] rpart_3.1-51

    loaded via a namespace (and not attached):
    [1] codetools_0.2-8 foreach_1.4.0 grid_2.14.2 iterators_1.0.6
    [5] pkgXMLBuilder_1.0 revoIpe_1.0 tools_2.14.2 XML_3.9-1.1

    > require(splines)
    Loading required package: splines
    > N <- 0
    > set.seed(1)
    > for (i in 1:100) N <- N + identical(ns(1:10,3),ns(1:10,3))
    > N
    [1] 32
1 answer

This is the answer I received from Revolution technical support (posted here with permission):

The issue here is one of floating-point arithmetic. Revolution R uses the Intel MKL BLAS library for some computations, which differs from what CRAN-R uses, and it uses this library for the 'ns()' computation. In this case you will also get different results depending on whether you run the computation on an Intel-based machine or on a machine with an AMD chipset.

We ship the same BLAS and LAPACK DLLs that come with CRAN-R, but they are not the default ones used by Revolution R. Customers can revert to those DLLs if they prefer, by doing the following:

1). Rename "Rblas.dll" to "Rblas.dll.bak" and "Rlapack.dll" to "Rlapack.dll.bak" in the folder "C:\Revolution\R-Enterprise-6.1\R-2.14.2\bin\x64".

2). Rename the files "Rblas.dll.0" and "Rlapack.dll.0" in this folder to "Rblas.dll" and "Rlapack.dll", respectively.
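The two renaming steps can also be scripted. The following is only a sketch: revert_to_cran_blas is a hypothetical helper name, not something shipped with Revolution R, and the demonstration runs against a throwaway directory of empty placeholder files instead of the real installation folder.

```r
# Hypothetical helper (not part of Revolution R): performs the two renaming
# steps inside `bin_dir`, refusing to act if any expected file is missing
# or if a backup from an earlier swap is already present.
revert_to_cran_blas <- function(bin_dir) {
  active <- file.path(bin_dir, c("Rblas.dll", "Rlapack.dll"))
  backup <- paste0(active, ".bak")
  cran   <- paste0(active, ".0")
  stopifnot(all(file.exists(active)),
            all(file.exists(cran)),
            !any(file.exists(backup)))
  step1 <- file.rename(active, backup)  # step 1: set the current DLLs aside
  step2 <- file.rename(cran, active)    # step 2: promote the CRAN-R DLLs
  all(step1, step2)
}

# Dry run on a temporary directory holding empty placeholder files.
demo_dir <- tempfile("blas_swap_demo")
dir.create(demo_dir)
invisible(file.create(file.path(
  demo_dir, c("Rblas.dll", "Rlapack.dll", "Rblas.dll.0", "Rlapack.dll.0")
)))

swapped   <- revert_to_cran_blas(demo_dir)
remaining <- list.files(demo_dir)
```

On a real installation, bin_dir would be the folder named in step 1, and R would need to be restarted before the swapped DLLs are picked up.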

Their suggestion worked great. I renamed these files back and forth several times, using both RStudio (with Revolution R) and the Revolution R IDE, always with the same result: the Intel MKL BLAS DLLs give me N==40 or so, and the CRAN-R DLLs give me N==100.
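One general reason two BLAS builds can disagree in the last bit is that floating-point addition is not associative, so a library that sums terms in a different order can round differently. The sketch below is a generic base-R illustration of that effect, not a reproduction of what MKL does internally.

```r
x <- c(0.1, 0.2, 0.3)

# The same three numbers, added in two different orders.
left_first  <- (x[1] + x[2]) + x[3]
right_first <- x[1] + (x[2] + x[3])

gap <- abs(left_first - right_first)

exactly_equal <- identical(left_first, right_first)          # bit-for-bit
nearly_equal  <- isTRUE(all.equal(left_first, right_first))  # within tolerance
```

The two sums differ only in the last bit, which is a discrepancy of similar magnitude to the 2^-54 differences reported in the question.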

I will probably go back to the MKL BLAS, because in my tests it is 8 times faster for %*% and 4 times faster for svd(). And that is while using just one of my cores (checked via the CPU usage column of the Processes tab in Windows Task Manager).
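A comparison of this kind can be timed with system.time(), running the same script once under each set of DLLs. The matrix size and repetition counts below are arbitrary choices for illustration, not the ones used in my tests.

```r
set.seed(1)
n <- 200  # arbitrary matrix size, chosen only for illustration
m <- matrix(rnorm(n * n), nrow = n)

# Elapsed wall-clock seconds for repeated matrix products and SVDs.
time_matmul <- system.time(for (i in 1:10) m %*% m)[["elapsed"]]
time_svd    <- system.time(for (i in 1:3) svd(m))[["elapsed"]]

timings <- c(matmul = time_matmul, svd = time_svd)
```

Dividing the timings from one BLAS by those from the other gives speed-up ratios like the ones quoted above.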

I hope that someone with a better understanding can write a better answer, because I still do not quite understand all the consequences of this.
