Caret and GBM Errors

Question

I am trying to use the caret package in R for several nested cross-validation processes with user-defined performance metrics. I ran into all kinds of problems, so I stepped back to check whether plain out-of-the-box use of caret had problems, and it looks like it does.

If I run the following:

    install.packages("caret")
    install.packages("gbm")
    library(caret)
    library(gbm)

    data(GermanCredit)
    GermanCredit$Class <- ifelse(GermanCredit$Class == 'Bad', 1, 0)

    gbmGrid <- expand.grid(.interaction.depth = 1,
                           .n.trees = 150,
                           .shrinkage = 0.1)

    gbmMOD <- train(Class ~ ., data = GermanCredit,
                    method = "gbm",
                    tuneGrid = gbmGrid,
                    distribution = "bernoulli",
                    bag.fraction = 0.5,
                    train.fraction = 0.5,
                    n.minobsinnode = 10,
                    cv.folds = 1,
                    keep.data = TRUE,
                    verbose = TRUE)

I get an error (or similar):

  Error in { : task 1 failed - "arguments imply differing number of rows: 619, 381"

with warnings:

 1: In eval(expr, envir, enclos) : model fit failed for Resample01: interaction.depth=1, n.trees=150, shrinkage=0.1

But if I call gbm directly, everything finishes without errors.

    gbm1 <- gbm(Class ~ ., data = GermanCredit,
                distribution = "bernoulli",
                n.trees = 150,            # number of trees
                shrinkage = 0.10,
                interaction.depth = 1,
                bag.fraction = 0.5,
                train.fraction = 0.5,
                n.minobsinnode = 10,
                cv.folds = 1,
                keep.data = TRUE,
                verbose = TRUE)

B_miner Feb 10 '13 at 21:04

2 answers

Just a note: although the problem in the question was caused by the reasons described in the accepted answer, the error message below can also occur with older versions of caret and gbm. I ran into this error and, after spending a lot of time figuring out what was wrong, found that I had to upgrade to the latest versions of caret (5.17-7) and gbm (2.1-0.1). These are the latest versions on CRAN as of this writing.

 Error in { : task 1 failed - "arguments imply differing number of rows: ...
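
To check whether outdated packages could be the cause, it may help to look at the installed versions first. The snippet below is a minimal sketch; the helper name installed_version is made up for illustration, and it only reports versions without changing anything.

```r
# Hypothetical helper: installed version of a package, or NA if it is missing.
installed_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- vapply(c("caret", "gbm"), installed_version, character(1))
print(versions)

# Uncomment to install the current CRAN releases of both packages.
# install.packages(c("caret", "gbm"))
```

If either version is older than the ones mentioned above, upgrading and restarting the R session before re-running train is a reasonable first step.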

xbsd Oct 11 '13 at 21:37

topepo · Accepted Answer · 2013-08-05T12:48:07+0000

There were two problems. First, passing cv.folds caused the error. Second, you do not need to convert the outcome to a binary number; doing so makes train think this is a regression problem. The idea behind the train function is to smooth out inconsistencies between modeling functions, so we use factors for classification and numbers for regression.
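
Putting both points together, a corrected call might look like the sketch below. It is not the exact code from the answer. It assumes a recent caret release, where the tuning grid columns have no leading dot and n.minobsinnode is part of the grid; with the 2013-era versions from the question, the original grid and arguments would be kept as they were. The 5-fold resampling setup and the seed are arbitrary choices for illustration.

```r
library(caret)  # the gbm package must also be installed

data(GermanCredit)

# Leave Class as the two-level factor it ships as, so that train()
# treats this as a classification problem.
stopifnot(is.factor(GermanCredit$Class))

# Assumption: recent caret, where n.minobsinnode is a tuning parameter
# for method = "gbm" and grid column names have no leading dot.
gbmGrid <- expand.grid(interaction.depth = 1,
                       n.trees = 150,
                       shrinkage = 0.1,
                       n.minobsinnode = 10)

set.seed(1)  # arbitrary seed, only for reproducible resampling
gbmMOD <- train(Class ~ ., data = GermanCredit,
                method = "gbm",
                tuneGrid = gbmGrid,
                trControl = trainControl(method = "cv", number = 5),
                bag.fraction = 0.5,  # passed through to gbm
                verbose = FALSE)     # passed through to gbm; cv.folds is not passed

print(gbmMOD)
```

Resampling is left to train via trControl instead of gbm's own cv.folds, and the distribution argument is omitted so that caret can pick it from the factor outcome.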
