Help me with my Python backprop implementation

EDIT2:

New training set ...

Inputs:

    [ [0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0], [0.0, 4.0],
      [1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0],
      [2.0, 0.0], [2.0, 1.0], [2.0, 2.0], [2.0, 3.0], [2.0, 4.0],
      [3.0, 0.0], [3.0, 1.0], [3.0, 2.0], [3.0, 3.0], [3.0, 4.0],
      [4.0, 0.0], [4.0, 1.0], [4.0, 2.0], [4.0, 3.0], [4.0, 4.0] ]

Outputs:

    [ [0.0], [0.0], [0.0], [0.0], [0.0],
      [0.0], [0.0], [0.0], [0.0], [0.0],
      [0.0], [0.0], [0.0], [0.0], [0.0],
      [0.0], [0.0], [0.0], [1.0], [1.0],
      [0.0], [0.0], [0.0], [1.0], [1.0] ]

EDIT1:

I updated the question with my latest code. I fixed a few minor issues, but I still get the same result for all input combinations after the network has been trained.

The backprop algorithm I am following: Backprop algorithm


Yes, this is homework, just to make that clear at the very beginning.

I am supposed to implement a simple backpropagation algorithm on a simple neural network.

I chose Python as the language for this task, and I chose the following neural network structure:

3 layers: 1 input, 1 hidden, 1 output layer:

    O O     (input layer)
    O O     (hidden layer)
     O      (output layer)

There are integers on the input neurons and 1 or 0 on the output neuron.

Here is my whole implementation (a bit long). Below it I will quote only the shorter relevant fragments where I think the error may be:

    import os
    import math
    import Image
    import random
    from random import sample

    #------------------------------ class definitions

    class Weight:
        def __init__(self, fromNeuron, toNeuron):
            self.value = random.uniform(-0.5, 0.5)
            self.fromNeuron = fromNeuron
            self.toNeuron = toNeuron
            fromNeuron.outputWeights.append(self)
            toNeuron.inputWeights.append(self)
            # delta value, this will accumulate and after each training cycle
            # will be used to adjust the weight value
            self.delta = 0.0

        def calculateDelta(self, network):
            self.delta += self.fromNeuron.value * self.toNeuron.error

    class Neuron:
        def __init__(self):
            self.value = 0.0       # the output
            self.idealValue = 0.0  # the ideal output
            self.error = 0.0       # error between output and ideal output
            self.inputWeights = []
            self.outputWeights = []

        def activate(self, network):
            x = 0.0
            for weight in self.inputWeights:
                x += weight.value * weight.fromNeuron.value
            # sigmoid function
            if x < -320:
                self.value = 0
            elif x > 320:
                self.value = 1
            else:
                self.value = 1 / (1 + math.exp(-x))

    class Layer:
        def __init__(self, neurons):
            self.neurons = neurons

        def activate(self, network):
            for neuron in self.neurons:
                neuron.activate(network)

    class Network:
        def __init__(self, layers, learningRate):
            self.layers = layers
            self.learningRate = learningRate  # the rate at which the network learns
            self.weights = []
            for hiddenNeuron in self.layers[1].neurons:
                for inputNeuron in self.layers[0].neurons:
                    self.weights.append(Weight(inputNeuron, hiddenNeuron))
                for outputNeuron in self.layers[2].neurons:
                    self.weights.append(Weight(hiddenNeuron, outputNeuron))

        def setInputs(self, inputs):
            self.layers[0].neurons[0].value = float(inputs[0])
            self.layers[0].neurons[1].value = float(inputs[1])

        def setExpectedOutputs(self, expectedOutputs):
            self.layers[2].neurons[0].idealValue = expectedOutputs[0]

        def calculateOutputs(self, expectedOutputs):
            self.setExpectedOutputs(expectedOutputs)
            self.layers[1].activate(self)  # activation function for hidden layer
            self.layers[2].activate(self)  # activation function for output layer

        def calculateOutputErrors(self):
            for neuron in self.layers[2].neurons:
                neuron.error = (neuron.idealValue - neuron.value) * neuron.value * (1 - neuron.value)

        def calculateHiddenErrors(self):
            for neuron in self.layers[1].neurons:
                error = 0.0
                for weight in neuron.outputWeights:
                    error += weight.toNeuron.error * weight.value
                neuron.error = error * neuron.value * (1 - neuron.value)

        def calculateDeltas(self):
            for weight in self.weights:
                weight.calculateDelta(self)

        def train(self, inputs, expectedOutputs):
            self.setInputs(inputs)
            self.calculateOutputs(expectedOutputs)
            self.calculateOutputErrors()
            self.calculateHiddenErrors()
            self.calculateDeltas()

        def learn(self):
            for weight in self.weights:
                weight.value += self.learningRate * weight.delta

        def calculateSingleOutput(self, inputs):
            self.setInputs(inputs)
            self.layers[1].activate(self)
            self.layers[2].activate(self)
            #return round(self.layers[2].neurons[0].value, 0)
            return self.layers[2].neurons[0].value

    #------------------------------ initialize objects etc

    inputLayer = Layer([Neuron() for n in range(2)])
    hiddenLayer = Layer([Neuron() for n in range(100)])
    outputLayer = Layer([Neuron() for n in range(1)])
    learningRate = 0.5
    network = Network([inputLayer, hiddenLayer, outputLayer], learningRate)

    # just for debugging, the real training set is much larger
    trainingInputs = [
        [0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
        [0.0, 1.0], [1.0, 1.0], [2.0, 1.0],
        [0.0, 2.0], [1.0, 2.0], [2.0, 2.0]
    ]
    trainingOutputs = [
        [0.0], [1.0], [1.0],
        [0.0], [1.0], [0.0],
        [0.0], [0.0], [1.0]
    ]

    #------------------------------ let train
    for i in range(500):
        for j in range(len(trainingOutputs)):
            network.train(trainingInputs[j], trainingOutputs[j])
        network.learn()

    #------------------------------ let check
    for pattern in trainingInputs:
        print network.calculateSingleOutput(pattern)

Now the problem is that after training, the network seems to return a floating-point number very close to 0.0 for all input combinations, even for those whose output should be close to 1.0.

I train the network for 100 cycles; in each cycle I do the following:

For each set of inputs in the training set:

  • Set the network inputs
  • Calculate the outputs using the sigmoid function
  • Calculate the output layer errors
  • Calculate the hidden layer errors
  • Calculate the weight deltas

Then I adjust the weights based on the learning rate and the accumulated deltas (see the sketch below).
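For reference, here is a minimal sketch of how those steps map onto the methods of the Network class in the full listing above; it is illustrative only, the actual loop I run is shown further down:

    # One training cycle, using the methods from the full listing above.
    # train() performs the five steps listed (inputs, outputs, output errors,
    # hidden errors, weight deltas); learn() applies the accumulated deltas
    # once per cycle using the learning rate.
    for cycle in range(100):
        for j in range(len(trainingOutputs)):
            network.train(trainingInputs[j], trainingOutputs[j])
        network.learn()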

Here is my neuron activation function:

    def activationFunction(self, network):
        """
        Calculate an activation function of a neuron which is
        a sum of all input weights * neurons where those weights start
        """
        x = 0.0
        for weight in self.inputWeights:
            x += weight.value * weight.getFromNeuron(network).value
        # sigmoid function
        self.value = 1 / (1 + math.exp(-x))

This is how I calculate deltas:

    def calculateDelta(self, network):
        self.delta += self.getFromNeuron(network).value * self.getToNeuron(network).error

This is the general flow of my algorithm:

    for i in range(numberOfIterations):
        for k, expectedOutput in trainingSet.iteritems():
            coordinates = k.split(",")
            network.setInputs((float(coordinates[0]), float(coordinates[1])))
            network.calculateOutputs([float(expectedOutput)])
            network.calculateOutputErrors()
            network.calculateHiddenErrors()
            network.calculateDeltas()
        oldWeights = network.weights
        network.adjustWeights()
        network.resetDeltas()
        print "Iteration ", i
        j = 0
        for weight in network.weights:
            print "Weight W", weight.i, weight.j, ": ", oldWeights[j].value, " ............ Adjusted value : ", weight.value
            j += j

The last two lines of output:

    0.552785449458  # this should be close to 1
    0.552785449458  # this should be close to 0

It actually returns the same output value for all input combinations.

Did I miss something?

python math algorithm neural-network
1 answer

It looks like you are getting almost the initial state of the Neuron (nearly self.idealValue). Maybe you should not initialize that Neuron before providing the actual data?

EDIT: Well, I looked a little deeper into the code and simplified it a bit (I will post the simplified version below). Basically, your code has two minor bugs (which it looks like you just didn't notice), but they lead to a network that definitely will not work.

  • You forgot to set the expected output value in the output layer during the training phase. Without that, the network definitely can't learn anything and will always stay stuck on the initial idealValue. (This is what I believed I saw on first reading.) You could even notice this in your description of the learning steps (and probably would have, if you hadn't posted the code; this is one of the rare cases I know of where posting the code actually hid the error instead of making it obvious). You fixed this in your EDIT1.
  • When you activate the network in calculateSingleOutput, you forgot to activate the hidden layer (a sketch of both fixes follows below).

Obviously, either of these two problems alone leads to a network that cannot work.

After fixing them, it works (well, in my simplified version of your code).
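For concreteness, here is a sketch of what the two corrected methods look like, written against the method names from your own listing (your latest posted code already contains both corrections):

    def calculateOutputs(self, expectedOutputs):
        # fix 1: store the expected outputs before activating the layers,
        # otherwise the output error is always computed against the
        # initial idealValue of 0.0
        self.setExpectedOutputs(expectedOutputs)
        self.layers[1].activate(self)   # hidden layer
        self.layers[2].activate(self)   # output layer

    def calculateSingleOutput(self, inputs):
        self.setInputs(inputs)
        # fix 2: activate the hidden layer as well, not just the output layer
        self.layers[1].activate(self)
        self.layers[2].activate(self)
        return self.layers[2].neurons[0].value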

The errors were not easy to spot, because the source code was too complicated. You should think twice before introducing new classes or new methods. Not creating enough methods or classes makes code difficult to read and maintain, but creating too many can make it even harder to read and maintain. You have to find the right balance. My personal way of finding this balance is to follow code smells and refactoring techniques wherever they lead me, sometimes adding methods or creating classes, sometimes removing them. It is certainly not perfect, but it is what works for me.

Below is my version of the code after applying some refactoring. I spent about an hour changing the code, but always kept it functionally equivalent. I took it as a good refactoring exercise, as the original code was really awful to read. After the refactoring, it took only 5 minutes to spot the problems.

    import os
    import math

    """
    A simple backprop neural network. It has 3 layers:
        Input layer:  2 neurons
        Hidden layer: 2 neurons
        Output layer: 1 neuron
    """

    class Weight:
        """
        Class representing a weight between two neurons
        """
        def __init__(self, value, from_neuron, to_neuron):
            self.value = value
            self.from_neuron = from_neuron
            from_neuron.outputWeights.append(self)
            self.to_neuron = to_neuron
            to_neuron.inputWeights.append(self)
            # delta value, this will accumulate and after each training cycle
            # will be used to adjust the weight value
            self.delta = 0.0

    class Neuron:
        """
        Class representing a neuron.
        """
        def __init__(self):
            self.value = 0.0        # the output
            self.idealValue = 0.0   # the ideal output
            self.error = 0.0        # error between output and ideal output
            self.inputWeights = []  # weights that end in the neuron
            self.outputWeights = [] # weights that start in the neuron

        def activate(self):
            """
            Calculate an activation function of a neuron which is
            a sum of all input weights * neurons where those weights start
            """
            x = 0.0
            for weight in self.inputWeights:
                x += weight.value * weight.from_neuron.value
            # sigmoid function
            self.value = 1 / (1 + math.exp(-x))

    class Network:
        """
        Class representing a whole neural network. Contains layers.
        """
        def __init__(self, layers, learningRate, weights):
            self.layers = layers
            self.learningRate = learningRate  # the rate at which the network learns
            self.weights = weights

        def training(self, entries, expectedOutput):
            for i in range(len(entries)):
                self.layers[0][i].value = entries[i]
            for i in range(len(expectedOutput)):
                self.layers[2][i].idealValue = expectedOutput[i]
            for layer in self.layers[1:]:
                for n in layer:
                    n.activate()
            for n in self.layers[2]:
                error = (n.idealValue - n.value) * n.value * (1 - n.value)
                n.error = error
            for n in self.layers[1]:
                error = 0.0
                for w in n.outputWeights:
                    error += w.to_neuron.error * w.value
                n.error = error
            for w in self.weights:
                w.delta += w.from_neuron.value * w.to_neuron.error

        def updateWeights(self):
            for w in self.weights:
                w.value += self.learningRate * w.delta

        def calculateSingleOutput(self, entries):
            """
            Calculate a single output for input values.
            This will be used to debug the already learned network after training.
            """
            for i in range(len(entries)):
                self.layers[0][i].value = entries[i]
            # activation function for output layer
            for layer in self.layers[1:]:
                for n in layer:
                    n.activate()
            print self.layers[2][0].value

    #------------------------------ initialize objects etc

    neurons = [Neuron() for n in range(5)]

    w1 = Weight(-0.79, neurons[0], neurons[2])
    w2 = Weight( 0.51, neurons[0], neurons[3])
    w3 = Weight( 0.27, neurons[1], neurons[2])
    w4 = Weight(-0.48, neurons[1], neurons[3])
    w5 = Weight(-0.33, neurons[2], neurons[4])
    w6 = Weight( 0.09, neurons[3], neurons[4])
    weights = [w1, w2, w3, w4, w5, w6]

    inputLayer  = [neurons[0], neurons[1]]
    hiddenLayer = [neurons[2], neurons[3]]
    outputLayer = [neurons[4]]
    learningRate = 0.3
    network = Network([inputLayer, hiddenLayer, outputLayer], learningRate, weights)

    # just for debugging, the real training set is much larger
    trainingSet = [([0.0, 0.0], [0.0]),
                   ([1.0, 0.0], [1.0]),
                   ([2.0, 0.0], [1.0]),
                   ([0.0, 1.0], [0.0]),
                   ([1.0, 1.0], [1.0]),
                   ([2.0, 1.0], [0.0]),
                   ([0.0, 2.0], [0.0]),
                   ([1.0, 2.0], [0.0]),
                   ([2.0, 2.0], [1.0])]

    #------------------------------ let train
    for i in range(100):  # training iterations
        for entries, expectedOutput in trainingSet:
            network.training(entries, expectedOutput)
        network.updateWeights()

    # network has learned, let check
    network.calculateSingleOutput((1, 0))  # this should be close to 1
    network.calculateSingleOutput((0, 0))  # this should be close to 0

By the way, there is a third problem that I did not fix (but it is easy to fix). If x is too large or too small (> 320 or < -320), math.exp() will throw an exception. This will happen if you run the training for, say, several thousand iterations. The easiest fix I see is to check the value of x and, if it is too big or too small, set the neuron's value to 0 or 1 depending on the case, which is the limit value anyway.
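A sketch of that fix, applied to the simplified activate() above (the ±320 cutoff simply mirrors the threshold mentioned here and already used in the original question's code):

    def activate(self):
        x = 0.0
        for weight in self.inputWeights:
            x += weight.value * weight.from_neuron.value
        # clamp extreme sums so math.exp() cannot blow up;
        # the sigmoid is effectively at its limit value out there anyway
        if x < -320:
            self.value = 0.0
        elif x > 320:
            self.value = 1.0
        else:
            self.value = 1 / (1 + math.exp(-x))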

