Using cost-sensitive C5.0 in caret

I use train from the caret package to train some C5.0 models. The plain C5.0 method works well, but now I want to use the cost-sensitive C5.0 method, and I am trying to understand how to set the cost parameter. What I want to do is assign a cost to mispredicting one of my classes. I searched the caret package website ( http://topepo.imtqy.com/caret/index.html ) and read some manuals and tutorials found here and there, but I found no information on how to handle the cost parameter. So here is what I tried on my own:

  • Run train with the default settings to see what I get. In the output, train tried cost values from 0 to 2 and picked cost = 2 as the best model.

  • Pass the cost as a matrix to expand.grid, the way the C50 package expects it. The code is below (trials is fixed at 1 because I just want a single tree / rule set in my output):

    c50Grid <- expand.grid(.trials = 1, .model = c("tree", "rules"), .winnow = c("TRUE", "FALSE"), .cost = matrix(c(0, 1, 2, 0), ncol = 2))

However, when I run train I get no errors (but 50 warnings), and train again tries cost values from 0 to 2. What am I doing wrong? What format does the cost parameter take? What does it mean, and how should I interpret the results? Which class does the cost apply to, as in "mispredicting class 0 costs twice as much as mispredicting class 1"? Also, I tried a single matrix here; even if that format had worked, how would I specify several costs that I want to test?
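A quick check in base R hints at why train still reports costs from 0 to 2: expand.grid treats each argument as a plain vector of candidate values, so a matrix is flattened into its individual cells. The sketch below assumes the intended matrix was c(0, 1, 2, 0) and does not call caret at all.

```r
# A matrix passed to expand.grid() is indexed like a vector, so its four
# cells become four candidate values for the .cost column.
c50Grid <- expand.grid(.trials = 1,
                       .model  = c("tree", "rules"),
                       .winnow = c("TRUE", "FALSE"),
                       .cost   = matrix(c(0, 1, 2, 0), ncol = 2))

# The grid holds one scalar cost per row, not a cost matrix.
print(nrow(c50Grid))
print(sort(unique(c50Grid$.cost)))
```

Under that reading, the grid simply asks for the scalar costs 0, 1 and 2, which matches the values train reported.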

Thanks! Any help would be really appreciated!


Edit:

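I found something while looking through the caret source for the C5.0Cost method, in C5.0Cost.R (https://r-forge.r-project.org/scm/viewvc.php/models/files/C5.0Cost.R?view=markup&root=caret&pathrev=761). The cost matrix is built like this: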

cmat <- matrix(c(0, param$cost, 1, 0), ncol = 2)

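So it seems the cost parameter is a single number that is placed into one cell of the matrix. If my classes = {0,1} and the first level is 0, does that mean the cost applies to "predicting 0 when it is 1"? Second question: can I do it the other way around? For "predicting 1 when it is 0", I would expect something like: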

cmat <- matrix(c(0, 1, param$cost, 0), ncol=2)
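To see where each variant puts the cost, it can help to build both matrices in base R with labelled rows and columns. This is only a sketch: the value cost = 2 and the level names "0" and "1" are illustrative assumptions, and whether rows stand for predicted or true classes should be checked against the C50 documentation for the installed version.

```r
cost <- 2                      # illustrative value, stands in for param$cost
lv   <- c("0", "1")            # assumed class levels, first level is "0"

# matrix() fills column by column, so the second element lands in row 2, column 1.
cmat_a <- matrix(c(0, cost, 1, 0), ncol = 2, dimnames = list(lv, lv))

# Swapping the second and third elements moves the cost to row 1, column 2.
cmat_b <- matrix(c(0, 1, cost, 0), ncol = 2, dimnames = list(lv, lv))

print(cmat_a)
print(cmat_b)
```

The second matrix is the transpose of the first, so the two variants penalize opposite kinds of error.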

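Or could I use a cost of 0.5 instead? In that case I would test costs below 1, such as {0.5, 0.6, 0.7, ...}. One more point: the C50 output says "Positive class = 0" [...] C50 [...] C5.0Cost [...]

Thanks!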


train does have a cost-sensitive version of C5.0 (use method = "C5.0Cost"). Here is an example:

library(caret)

set.seed(1)
dat1 <- twoClassSim(1000, intercept = -12)
dat2 <- twoClassSim(1000, intercept = -12)

stats <- function (data, lev = NULL, model = NULL)  {
  c(postResample(data[, "pred"], data[, "obs"]),
    Sens = sensitivity(data[, "pred"], data[, "obs"]),
    Spec = specificity(data[, "pred"], data[, "obs"]))
}

ctrl <- trainControl(method = "repeatedcv", repeats = 5,
                     summaryFunction = stats)

set.seed(2)
mod1 <- train(Class ~ ., data = dat1, 
              method = "C5.0",
              tuneGrid = expand.grid(model = "tree", winnow = FALSE,
                                     trials = c(1:10, (1:5)*10)),
              trControl = ctrl)

xyplot(Sens + Spec ~ trials, data = mod1$results, 
       type = "l",
       auto.key = list(columns = 2, 
                       lines = TRUE, 
                       points = FALSE))

set.seed(2)
mod2 <- train(Class ~ ., data = dat1, 
              method = "C5.0Cost",
              tuneGrid = expand.grid(model = "tree", winnow = FALSE,
                                     trials = c(1:10, (1:5)*10),
                                     cost = 1:10),
              trControl = ctrl)

xyplot(Sens + Spec ~ trials|format(cost), data = mod2$results, 
       type = "l",
       auto.key = list(columns = 2, 
                       lines = TRUE, 
                       points = FALSE))

Max


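"If my classes = {0,1} and the first level is 0, does that mean the cost applies to "predicting 0 when it is 1"? Second question: can I do it the other way around? For "predicting 1 when it is 0" [...]?"

[...]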
