Neuralnet package in R gives a large error

I am trying to figure out how to get the neuralnet package to work. I ran several tests on data I created myself (about 50 rows with three input columns, plus a fourth column holding the target, which was computed from the other three with simple arithmetic such as summing them), and so far so good. Then I decided to apply the package to real data. I downloaded the mpg dataset from here: http://vincentarelbundock.imtqy.com/Rdatasets/datasets.html

I used the code below:

net<- neuralnet(cty~displ+year+cyl+hwy,
                datain, hidden=3)

Whether I use 3 hidden neurons, or 8 or 18, the error is the same, and the package finishes suspiciously quickly for this amount of data (234 rows):

        Error Reached Threshold Steps
1 2110.173077    0.006277805853    54

Any advice on how to fix this?

1 answer

This is a scaling problem, I think: you can normalize or standardize (scale) the data before training. Normalizing and scaling are different, and the choice will affect your results; there is a separate question about that on SO.
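One rough way to check what happened without scaling: as far as I know, neuralnet's default error function (err.fct = "sse") reports half the sum of squared errors, so the reported error can be compared with what a model that always predicts the mean of the target would score. The helper names half.sse and baseline.error are made up for this sketch, and the toy vector is not the mpg data.

```r
# Half the sum of squared errors, the form neuralnet reports as "Error"
# when err.fct = "sse" (assumed here to be the setting in use).
half.sse = function(y, y.hat) {
  0.5 * sum((y - y.hat)^2)
}

# Error of a "model" that ignores its inputs and always predicts the mean.
baseline.error = function(y) {
  half.sse(y, mean(y))
}

# Toy target (hypothetical values, not the mpg data)
y.toy = c(10, 12, 14, 16, 18)
baseline.error(y.toy)  # 20
```

Applied to mpg$cty, baseline.error should come out at roughly 2110.17, the same value as in the question. That would suggest the unscaled network collapsed to a constant output, which would also explain why changing the number of hidden neurons made no difference.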

normalize inputs

norm.fun = function(x){ 
  (x - min(x))/(max(x) - min(x)) 
}

require(ggplot2) # load mpg dataset
require(neuralnet)

data = mpg[, c('cty', 'displ', 'year', 'cyl', 'hwy')]
data.norm = apply(data, 2, norm.fun)

net = neuralnet(cty ~ displ + year + cyl + hwy, data.norm, hidden = 2)

Then you can denormalize the predictions:

# restore the original scale of cty
# (diff(range(x)) is max - min; range(x) alone returns both values)
y.net = min(data$cty) + net$net.result[[1]] * diff(range(data$cty))
plot(data[, 'cty'], col = 'red')
points(y.net)
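To check that the back-transformation is right, a small round trip on a toy vector helps. This is only a sketch: denorm.fun is a hypothetical helper name and the values are invented. Note that range(y) returns both the minimum and the maximum, so the width of the interval has to be taken as max(y) - min(y), or equivalently diff(range(y)).

```r
# Min-max normalization to [0, 1]
norm.fun = function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Inverse of norm.fun: needs the min and max of the ORIGINAL variable
denorm.fun = function(x.norm, x.orig) {
  x.norm * (max(x.orig) - min(x.orig)) + min(x.orig)
}

y = c(9, 14, 18, 21, 35)        # hypothetical values
y.norm = norm.fun(y)
y.back = denorm.fun(y.norm, y)

range(y.norm)                   # 0 1
all.equal(y.back, y)            # TRUE
```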


scale inputs

data.scaled = scale(data)
net = neuralnet(cty ~ displ + year + cyl + hwy, data.scaled, hidden = 2)

# restore data (data$cty is a plain vector even when mpg is a tibble)
y.sd = sd(data$cty)
y.mean = mean(data$cty)

y.net = net$net.result[[1]] * y.sd + y.mean
plot(data[, 'cty'], col = 'red')
points(y.net)
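As an alternative to recomputing the mean and standard deviation by hand, scale() stores them as the attributes "scaled:center" and "scaled:scale" on its result, so they can be read back from there. A minimal sketch on an invented vector:

```r
y = c(9, 14, 18, 21, 35)                      # hypothetical values
y.scaled = scale(y)                           # one-column matrix

# scale() keeps the parameters it used as attributes
y.center = attr(y.scaled, "scaled:center")    # mean of y
y.scale  = attr(y.scaled, "scaled:scale")     # sd of y

# undo the standardization
y.back = as.numeric(y.scaled) * y.scale + y.center
all.equal(y.back, y)                          # TRUE
```

For a multi-column input such as data.scaled, the same attributes hold one value per column, so the entry for cty would be the one to use.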


You can also try the nnet package, which is very fast:

require(nnet)

data2 = mpg
data2$year = scale(data2$year)
fit = nnet(cty ~ displ + year + cyl + hwy, size = 10, data = data2, linout = TRUE)
plot(mpg$cty)
points(fit$fitted.values, col = 'red')