R: Variable importance values of a random forest model

What am I doing wrong here? What does "subscript out of bounds" mean?

I got the code excerpt below (first block) from a Revolution R online seminar on data analysis in R. I am trying to incorporate it into a random forest model that I ran, but I can't get past what I think is a problem with the ordering of the variables. I just want to plot the importance of the variables.

I have included a little more than needed below to give context, but the line that actually errors out is the third line of code. The second code block shows the errors I get when I apply this to the data I work with. Can someone help me figure this out?

    #-------------------------------------------------------------------------
    # List the importance of the variables.
    rn <- round(importance(model.rf), 2)
    rn[order(rn[,3], decreasing=TRUE),]

    # Plot variable importance
    varImpPlot(model.rf, main="", col="dark blue")
    title(main="Variable Importance Random Forest weather.csv",
          sub=paste(format(Sys.time(), "%Y-%b-%d %H:%M:%S"), Sys.info()["user"]))
    #--------------------------------------------------------------------------

The errors I get:

    > rn[order(rn[,2], decreasing=TRUE),]
    Error in order(rn[, 2], decreasing = TRUE) : subscript out of bounds
1 answer

I think I understand the confusion. I'd bet a Kit Kat that if you type ncol(rn), you'll see that rn has 2 columns, not 3 as you might expect. The first "column" you see on the screen is not really a column - it's just the row names of the rn object. Type rownames(rn) to confirm this. The last column of rn, the one you want to order by, is therefore rn[,2], not rn[,3]. The "subscript out of bounds" message appears because you asked R to order by column 3, but rn has no column 3.
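As a rough illustration of the checks above, here is a minimal base-R sketch. It does not fit a real forest: `rn` is a hand-built stand-in matrix with made-up values and the same shape as a two-column importance matrix, so the variable names and numbers are assumptions for demonstration only.

```r
# Stand-in for round(importance(model.rf), 2): a numeric matrix with
# two columns plus row names (values are illustrative, not model output).
rn <- matrix(c(17.06, 19.20, 17.71,
               181.71, 242.87, 191.16),
             ncol = 2,
             dimnames = list(c("cyl", "disp", "hp"),
                             c("%IncMSE", "IncNodePurity")))

print(ncol(rn))       # number of real columns
print(rownames(rn))   # the labels printed on the left are row names

# Asking for a column that does not exist raises the error from the question;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch(rn[, 3], error = function(e) conditionMessage(e))
print(msg)

# Ordering by the last existing column works.
sorted <- rn[order(rn[, ncol(rn)], decreasing = TRUE), ]
print(sorted)
```

Indexing with ncol(rn) rather than a hard-coded number avoids guessing how many columns the matrix has.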

Here is my short detective trail for anyone interested in what the "importance" object actually is... I installed the randomForest package and then ran an example from the online documentation:

    library(randomForest)
    set.seed(4543)
    data(mtcars)
    mtcars.rf <- randomForest(mpg ~ ., data=mtcars, ntree=1000,
                              keep.forest=FALSE, importance=TRUE)
    importance(mtcars.rf)

It turns out the "importance" object in this case looks like this (only the first few rows, to save space):

           %IncMSE IncNodePurity
    cyl  17.058932     181.70840
    disp 19.203139     242.86776
    hp   17.708221     191.15919
    ...

So ncol(importance(mtcars.rf)) is 2, and the row names are likely the cause of the confusion :)
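One possible way to sidestep the column-counting problem is to order by column name instead of position. The sketch below assumes a small helper, `sort_importance`, which is hypothetical and not part of randomForest; the matrix is rebuilt by hand from the three rows printed above rather than taken from a fitted model.

```r
# Hypothetical helper (not part of randomForest): sort an importance
# matrix by a named column, failing with a clearer message if it is missing.
sort_importance <- function(imp, by = colnames(imp)[1]) {
  if (!by %in% colnames(imp)) {
    stop("no column named '", by, "'; available: ",
         paste(colnames(imp), collapse = ", "))
  }
  imp[order(imp[, by], decreasing = TRUE), , drop = FALSE]
}

# The first three rows of importance(mtcars.rf) as printed above.
imp <- rbind(cyl  = c(17.058932, 181.70840),
             disp = c(19.203139, 242.86776),
             hp   = c(17.708221, 191.15919))
colnames(imp) <- c("%IncMSE", "IncNodePurity")

sorted <- sort_importance(imp, by = "%IncMSE")
print(sorted)
```

With a real model, the same call would presumably be sort_importance(importance(mtcars.rf), by = "%IncMSE"); the available column names depend on the model and on whether importance=TRUE was set, so checking colnames() first is a reasonable habit.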


Source: https://habr.com/ru/post/1416271/

