R: Line up trees from h2o.randomForest () and h2o.gbm ()

Question

R: Line up trees from h2o.randomForest () and h2o.gbm ()

Look for an effective way to build trees in rstudio, H2O Flow or on the local html page from h2o RF and GBM models, similar to those shown in the picture below. In particular, how do you draw trees for the objects (models suitable) rf1 and gbm2 generated by the code below, perhaps by parsing h2o.download_pojo (rf1) or h2o.download_pojo (gbm1)?

# # The following two commands remove any previously installed H2O packages for R. # if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) } # if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") } # # Next, we download packages that H2O depends on. # pkgs <- c("methods","statmod","stats","graphics","RCurl","jsonlite","tools","utils") # for (pkg in pkgs) { # if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) } # } # # # Now we download, install h2o package # install.packages("h2o", type="source", repos=(c("http://h2o-release.s3.amazonaws.com/h2o/rel-turchin/3/R"))) library(h2o) h2o.init(nthreads = -1, max_mem_size = "2G") h2o.removeAll() ##clean slate - just in case the cluster was already running ## Load data - available to download from link below ## https://www.dropbox.com/s/gu8e2o0mzlozbu4/SampleData.csv?dl=0 df <- h2o.importFile(path = normalizePath("../SampleData.csv")) splits <- h2o.splitFrame(df, c(0.4, 0.3), seed = 1234) train <- h2o.assign(splits[[1]], "train.hex") valid <- h2o.assign(splits[[2]], "valid.hex") test <- h2o.assign(splits[[2]], "test.hex") predictor_col_start_pos <- 2 predictor_col_end_pos <- 169 predicted_col_pos <- 1 rf1 <- h2o.randomForest(training_frame = train, validation_frame = valid, x = predictor_col_start_pos:predictor_col_end_pos, y = predicted_col_pos, model_id = "rf_covType_v1", ntrees = 2000, stopping_rounds = 10, score_each_iteration = T, seed = 2001) gbm1 <- h2o.gbm(training_frame = train, validation_frame = valid, x = predictor_col_start_pos:predictor_col_end_pos, y = predicted_col_pos, model_id = "gbm_covType2", seed = 2002, ntrees = 20, learn_rate = 0.2, max_depth = 10, stopping_rounds = 2, stopping_tolerance = 0.01, score_each_iteration = T) ## Next step would be to plot trees for fitted models rf1 and gbm2 # print the model, POJO (Plain Old Java Object) to screen h2o.download_pojo(rf1) h2o.download_pojo(gbm1)

+6

r data-visualization random-forest h2o gbm

Webby May 04 '16 at 12:25

source share

1 answer

Raul vp · Accepted Answer · 2017-03-19T09:42:04+0000

I think this may be the solution you are looking for;

 library(h2o) h2o.init() df = h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip") model = h2o.gbm(model_id = "model", training_frame = df, x = c("Year", "Month", "DayofMonth", "DayOfWeek", "UniqueCarrier"), y = "IsDepDelayed", max_depth = 3, ntrees = 5) h2o.download_mojo(model, getwd(), FALSE)

Now download the latest stable version of h2o from http://www.h2o.ai/download/ and run the PrintMojo tool from the command line.

 java -cp h2o.jar hex.genmodel.tools.PrintMojo --tree 0 -i model.zip -o model.gv dot -Tpng model.gv -o model.png

open model.png

Additional information: http://docs.h2o.ai/h2o/latest-stable/h2o-genmodel/javadoc/index.html

R: Line up trees from h2o.randomForest () and h2o.gbm ()

More articles: