R, DMwR package: SMOTE function will not work

I need to apply the SMOTE algorithm to a dataset, but I cannot make it work.

Example:

    x <- c(12, 13, 14, 16, 20, 25, 30, 50, 75, 71)
    y <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1)
    frame <- data.frame(x, y)
    library(DMwR)
    smotedobs <- SMOTE(y ~ ., frame, perc.over = 300)

This results in the following error:

    Error in scale.default(T, T[i, ], ranges) : subscript out of bounds
    In addition: Warning messages:
    1: In FUN(newX[, i], ...) :
      no non-missing arguments to max; returning -Inf
    2: In FUN(newX[, i], ...) :
      no non-missing arguments to min; returning Inf

Any help or tips would be appreciated.

3 answers

I do not have a complete answer, but I can offer one hint:

If you convert 'y' to a factor, SMOTE returns without errors, but the synthesized observations have NA values for x.


SMOTE has a bug on 32-bit Windows 7: it assumes that the target variable in the 'form' parameter is the last column of the data set. The following code illustrates this:

    library(DMwR)
    data(iris)
    # data <- iris[, c(1, 2, 5)]  # SMOTE works
    data <- iris[, c(2, 5, 1)]    # SMOTE bug
    data$Species <- factor(ifelse(data$Species == "setosa", "rare", "common"))
    head(data)
    table(data$Species)
    newData <- SMOTE(Species ~ ., data, perc.over = 600, perc.under = 100)
    table(newData$Species)

The following error message appears:

    Error in `colnames<-`(`*tmp*`, value = c("Sepal.Width", "Species", "Sepal.Length" :
      'names' attribute [3] must be the same length as the vector [2]

On 64-bit Windows 7, the column-order problem does not occur.
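
If the failure really does depend on where the target column sits, one workaround worth trying is to move the target to the last position before calling SMOTE. The following is a minimal base R sketch, not a confirmed fix; the helper name move_target_last is made up for illustration and is not part of DMwR. The SMOTE call is only attempted when DMwR is installed.

```r
# Hypothetical helper (not part of DMwR): put the target column last.
move_target_last <- function(df, target) {
  stopifnot(target %in% names(df))
  df[, c(setdiff(names(df), target), target), drop = FALSE]
}

data(iris)
data <- iris[, c(2, 5, 1)]  # the column order reported to fail above
data$Species <- factor(ifelse(data$Species == "setosa", "rare", "common"))

# Reorder so that Species becomes the last column
data <- move_target_last(data, "Species")
print(names(data))

# Only attempted when DMwR is available on this machine
if (requireNamespace("DMwR", quietly = TRUE)) {
  newData <- DMwR::SMOTE(Species ~ ., data, perc.over = 600, perc.under = 100)
  print(table(newData$Species))
}
```

Reordering by name rather than by position keeps the sketch usable when the number of predictor columns changes.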


There is a bug in the SMOTE code. It assumes that the target variable y it is given is already a factor; currently it does not handle the edge case of non-factors. Before calling the function, make sure the target has been converted to a factor.
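
Applied to the example from the question, the conversion could look like the sketch below. It only demonstrates the factor conversion; as noted in an earlier answer, the synthesized rows may still contain NA values for x, so this is not presented as a complete fix. The SMOTE call is only attempted when DMwR is installed.

```r
x <- c(12, 13, 14, 16, 20, 25, 30, 50, 75, 71)
y <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1)
frame <- data.frame(x, y)

# Convert the target to a factor before handing the data to SMOTE
frame$y <- factor(frame$y)
str(frame)

# Only attempted when DMwR is available on this machine
if (requireNamespace("DMwR", quietly = TRUE)) {
  smotedobs <- DMwR::SMOTE(y ~ ., frame, perc.over = 300)
  print(summary(smotedobs))  # check the x column for NA values
}
```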

