Keras neural network outputs the same result for each input

I tried to implement a simple feedforward neural network.

This is the structure: an input layer with 8 neurons, a hidden layer with 8 neurons, and an output layer with 8 neurons.

The inputs are 8-bit vectors (one bit per neuron of the input layer), and the outputs are also 8-bit vectors, so the complete dataset contains 256 examples.

Example: for the input

    x = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0]

the output should be

    y = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0]

This is the implementation:

    from keras.models import Sequential
    from keras.layers import Dense
    import numpy as np
    import random

    # Dimension of the layers
    dim = 8

    # Generate the dataset: all 256 binary vectors of length 8
    X = []
    for i in range(0, 2**dim):
        n = [float(x) for x in bin(i)[2:]]
        X.append([0.]*(dim-len(n)) + n)
    # The targets are a random permutation of the inputs
    y = X[:]
    random.shuffle(y)
    X = np.array(X)
    y = np.array(y)

    # Create model
    model = Sequential()
    model.add(Dense(dim, input_dim=dim, init='normal', activation='sigmoid'))
    model.add(Dense(dim, init='normal', activation='sigmoid'))
    model.add(Dense(dim, init='normal', activation='sigmoid'))

    # Compile model
    model.compile(loss='mse', optimizer='SGD', metrics=['accuracy'])

    # Fit the model
    model.fit(X, y, nb_epoch=1000, batch_size=50, verbose=0)

    # Evaluate the model
    scores = model.evaluate(X, y)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

    output = model.predict(X)

    # Threshold the outputs to make them binary
    for i in range(0, output[:, 0].size):
        for j in range(0, output[0].size):
            if output[i][j] >= 0.5:
                output[i][j] = 1
            else:
                output[i][j] = 0
    print(output)

This is what I get in the output:

    acc: 50.39%
    [[ 1.  0.  0. ...,  0.  1.  1.]
     [ 1.  0.  0. ...,  0.  1.  1.]
     [ 1.  0.  0. ...,  0.  1.  1.]
     ...,
     [ 1.  0.  0. ...,  0.  1.  1.]
     [ 1.  0.  0. ...,  0.  1.  1.]
     [ 1.  0.  0. ...,  0.  1.  1.]]

It seems that every input produces the same output, and I do not know what is wrong with the setup. I have tried this answer (https://stackoverflow.com/a/2908371/), which suggests removing the activation function from the output layer, but when I run it, every output vector comes out as:

    [ 0.  1.  1. ...,  1.  1.  1.]
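For reference, that variant just replaces the last layer of the model above with a linear one, i.e. a Dense layer with no activation argument:

    model.add(Dense(dim, init='normal'))  # linear output, no activation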

Any ideas on how to make it work?

3 answers

Your output looks like a multi-label classification problem, so I would recommend the following (a sketch applying these changes follows the list):

  1. Change the loss function to binary_crossentropy.
  2. Keep sigmoid as the activation of the last layer and change the rest; relu may be a good choice.
  3. Add validation to your fit call and increase the verbosity. This will let you see how your network changes across epochs, and in particular when it overfits or underfits.
  4. Add depth to the network until you overfit.
  5. Add regularization to your network until you no longer overfit.
  6. Repeat 4 and 5.
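A minimal sketch of the first three suggestions, keeping the old-style Keras API from the question (the validation_split fraction is just an illustrative choice):

    model = Sequential()
    model.add(Dense(dim, input_dim=dim, init='normal', activation='relu'))
    model.add(Dense(dim, init='normal', activation='relu'))
    model.add(Dense(dim, init='normal', activation='sigmoid'))  # keep sigmoid on the output

    # binary_crossentropy treats each output bit as an independent label
    model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])

    # hold out 20% of the data for validation and print progress per epoch
    model.fit(X, y, nb_epoch=1000, batch_size=50, validation_split=0.2, verbose=1)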

I had the same problem.

I would advise you to reduce the learning rate of SGD. In my case, I was using the Adam optimizer with lr = 0.001, but changing it to 0.0001 solved the problem.

Note that passing the string 'SGD' to compile uses the optimizer's defaults (learning rate 0.01). You can instead construct and configure it explicitly, for example:

    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
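Applied to the model in the question, lowering the learning rate would look something like this (0.001 here is just an example value to try):

    from keras.optimizers import SGD

    sgd = SGD(lr=0.001)  # lower than the default 0.01
    model.compile(loss='mse', optimizer=sgd, metrics=['accuracy'])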


If you have tried all of the above and it still doesn't work, it means you are trying to fit noise: there is no connection, correlation, or relevance between your inputs and outputs.

