How to get sparse matrices in H2O?

I am trying to get a sparse matrix in H2O and I was wondering if this is possible. Suppose we have:

library(Matrix)
test <- Matrix(c(1,0,0,1,1,1,1,0,1), nrow = 3, sparse = TRUE)

and assuming my local H2O instance is localH2O, I cannot do the following:

 as.h2o(test) 

This throws an error: cannot coerce class "structure("dgCMatrix", package = "Matrix")" to a data.frame. That seems logical. However, assuming test is so large that I cannot convert it to a data frame, how can I load it into H2O? In its sparse representation the matrix takes only 500 MB.

How can I load a sparse matrix in H2O?

r sparse-matrix h2o
1 answer

Pushing data that lives in R's memory into H2O's memory is a poor approach for essentially two reasons: R uploads the data to H2O with a file POST, which 1) does not take advantage of H2O's parallel reader and 2) requires your data to exist in R in the first place.

Instead, call h2o.importFile from R so that H2O's parallel reader does the work. Your data can live anywhere: HDFS, S3, a regular file system, ...

H2O has an SVMLight reader, so I recommend saving your sparse matrix from R in SVMLight format and importing that file.
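As a rough sketch of that workflow, the snippet below writes the example matrix to an SVMLight text file using only the Matrix package. The helper names write_svmlight and load_svmlight_into_h2o are hypothetical (not part of Matrix or h2o), the helper assumes a numeric dgCMatrix, and the label column is a dummy placeholder because the question has no target variable. The exact h2o.importFile signature and how the file type is detected depend on your h2o version, so treat the last function as an assumption to check against your installation.

```r
library(Matrix)

# Hypothetical helper: write a dgCMatrix to SVMLight text format,
# one line per row: "<label> <col>:<value> <col>:<value> ..."
write_svmlight <- function(x, y = rep(0, nrow(x)), file) {
  stopifnot(is(x, "dgCMatrix"), length(y) == nrow(x))

  # Recover 1-based row/column indices of the non-zeros from the
  # compressed-column slots, without ever building a dense matrix.
  i <- x@i + 1L
  j <- rep(seq_len(ncol(x)), diff(x@p))

  # sprintf avoids scientific notation in the column indices.
  entries <- sprintf("%d:%.15g", j, x@x)

  # Sort by row, then by column, and glue each row's entries together.
  ord   <- order(i, j)
  rows  <- split(entries[ord], factor(i[ord], levels = seq_len(nrow(x))))
  feats <- vapply(rows, paste, character(1), collapse = " ")

  # Rows with no non-zeros end up as just the label.
  writeLines(trimws(paste(sprintf("%.15g", y), feats)), file)
  invisible(file)
}

test <- Matrix(c(1,0,0,1,1,1,1,0,1), nrow = 3, sparse = TRUE)
path <- tempfile(fileext = ".svmlight")
write_svmlight(test, file = path)  # dummy label 0 for every row

# Hypothetical wrapper, defined but not run here: needs the h2o package
# and a reachable H2O instance. Older h2o releases take the connection
# object (e.g. localH2O) as the first argument instead.
load_svmlight_into_h2o <- function(path) {
  h2o::h2o.init()
  h2o::h2o.importFile(path)
}
```

For a matrix of the size described in the question, you would write the file to a location the H2O cluster can read (local disk, HDFS, S3) and then call load_svmlight_into_h2o(path) or h2o.importFile directly. Note that SVMLight always carries a leading label column, so the imported frame will have one extra column compared with the matrix.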

Hope this helps!
