Since version 3.0.0, R supports long vectors. A long vector is indexed by a double rather than an integer. A long vector can be the basis of a matrix or array with 2^31 or more elements, as long as each dimension is small enough to be indexed by an integer. Long vectors cannot be passed to native code via .C and .Fortran. The error message you are getting is caused by a long vector being passed through .C.
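To make the size limit concrete, here is a minimal C sketch of the check. The function name needs_long_vector is made up for illustration and is not part of R or glmnet. It only shows that each dimension can fit in a 32-bit integer while the total element count does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not an R API): returns true when an nrow x ncol
   matrix has more elements than a 32-bit signed integer can count,
   i.e. when it would have to be backed by a long vector. */
bool needs_long_vector(int32_t nrow, int32_t ncol)
{
    /* Widen to 64 bits before multiplying so the product cannot overflow. */
    int64_t n = (int64_t)nrow * (int64_t)ncol;
    return n > INT32_MAX;
}
```

For example, a 50000 x 50000 matrix has 2.5 billion elements, so it needs a long vector even though both dimensions are ordinary integers.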
Long vectors can be passed through .Call. So, as long as glmnet's native code supports long vectors (64-bit indices), or can be modified or recompiled to support them, you only need to change the interface between R and the glmnet native code. You can do this by hand in C, and there is also a new package for this task called dotCall64. Part of changing the interface is deciding when to copy the arguments: .C and .Fortran copy defensively, but you don't want to copy large data structures unless you have to.
I think the difficulty of modifying glmnet's native code to support 64-bit indices depends on the actual code (which I have only looked at, not worked with). It is easy to switch all integers in Fortran code (whether explicitly or implicitly 32-bit) to 64-bit. Problems arise when some integers must stay 32-bit, and that will happen, for example, for integer vectors passed to or from R code, since R uses 32-bit integers (even in long vectors). glmnet has such integer vectors. How complicated the modification is then depends on how clean the Fortran source is (for example, whether it uses separate integer variables for indexing and for holding the values of integer arrays).
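The split between index types and value types can be sketched in C as follows. This is only an illustration under my own assumptions, not glmnet code: element_offset and count_nonzero are hypothetical names, and glmnet itself is Fortran. The point is that indices and counts become 64-bit while the stored integer values stay 32-bit, as R's integer vectors require.

```c
#include <stdint.h>

/* Hypothetical helper: zero-based column-major offset of element (i, j)
   in a matrix with nrow rows. The row and column numbers fit in 32 bits,
   but the offset is computed in 64 bits because it may exceed 2^31 - 1. */
int64_t element_offset(int32_t i, int32_t j, int32_t nrow)
{
    return (int64_t)i + (int64_t)j * (int64_t)nrow;
}

/* Hypothetical helper: counts the nonzero entries of an integer array.
   The values stay 32-bit (int32_t, like R integer vectors), while the
   length, the loop index and the result are 64-bit. */
int64_t count_nonzero(const int32_t *values, int64_t n)
{
    int64_t count = 0;
    for (int64_t k = 0; k < n; k++) {
        if (values[k] != 0) {
            count++;
        }
    }
    return count;
}
```

If the source already keeps these two roles in separate variables, only the index variables need to change type. If one integer variable is reused for both, each use has to be untangled by hand.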
Experimental implementations of subsets of R, such as Riposte, will not help.