XOR with neural networks (Matlab)

So, I hope this is something really dumb I'm doing, and there's a simple answer. I'm trying to train a 2x3x1 neural network to solve the XOR problem. It wasn't working, so I decided to dig in to see what was happening. Finally, I decided to assign the weights myself. This was the weight vector I came up with:

 theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
 theta2 = [14 13 -28 -6];

(In Matlab notation.) I intentionally tried to make no two weights the same (except for the zeros).
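As a quick sanity check (just a sketch, using the same sigmoid and bias-row convention as the full code below), a single forward pass with those hand-picked weights already gives outputs close to [0 1 1 0]:

 % Quick sanity check (not part of the training code): one forward pass with
 % the hand-picked weights to confirm they already solve XOR.
 sigmoid = @(Z) 1.0 ./ (1.0 + exp(-Z));
 X = [0 0 1 1; 0 1 0 1; 1 1 1 1];            % inputs plus a bias row of ones
 theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
 theta2 = [14 13 -28 -6];
 layer1 = [sigmoid(theta1 * X); 1 1 1 1];     % hidden layer plus bias row
 layer2 = sigmoid(theta2 * layer1)            % should be close to [0 1 1 0]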

And my Matlab code is very simple:

 function layer2 = xornn(iters)
     if nargin < 1
         iters = 50
     end
     function s = sigmoid(X)
         s = 1.0 ./ (1.0 + exp(-X));
     end
     T = [0 1 1 0];
     X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
     theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
     theta2 = [14 13 -28 -6];
     for i = [1:iters]
         layer1 = [sigmoid(theta1 * X); 1 1 1 1];
         layer2 = sigmoid(theta2 * layer1)
         delta2 = T - layer2;
         delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
         % Remove the bias from delta1. There's no real point in a delta on the bias.
         delta1 = delta1(1:3,:);
         theta2d = delta2 * layer1';
         theta1d = delta1 * X';
         theta1 = theta1 - 0.1 * theta1d;
         theta2 = theta2 - 0.1 * theta2d;
     end
 end

I believe that's right. I tested the various gradients (of the thetas) with the finite difference method to make sure they were correct, and they seemed to be.
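For reference, this is roughly the kind of finite-difference check I mean (just a sketch; the fwd, cost, and numgrad helper names are mine, and it assumes X, T, theta1, theta2 as defined above). It perturbs one weight of theta1 at a time and takes the numerical slope of the cross-entropy cost:

 % Finite-difference check of the theta1 gradient (sketch only).
 sigmoid = @(Z) 1.0 ./ (1.0 + exp(-Z));
 fwd  = @(t1, t2, X) sigmoid(t2 * [sigmoid(t1 * X); ones(1, size(X, 2))]);
 cost = @(t1, t2, X, T) -sum(T .* log(fwd(t1, t2, X)) + ...
                             (1 - T) .* log(1 - fwd(t1, t2, X)));
 ep = 1e-4;
 numgrad = zeros(size(theta1));
 for k = 1:numel(theta1)
     tp = theta1; tp(k) = tp(k) + ep;
     tm = theta1; tm(k) = tm(k) - ep;
     numgrad(k) = (cost(tp, theta2, X, T) - cost(tm, theta2, X, T)) / (2 * ep);
 end
 numgrad   % compare entry-by-entry with the analytic theta1d (mind the overall sign convention)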

But when I run it, it eventually just decays into returning all zeros. If I do xornn(1) (for 1 iteration), I get

 0.0027 0.9966 0.9904 0.0008 

But if I do xornn(35), I get

 0.0026 0.9949 0.9572 0.0007 

(It's creeping down in the wrong direction), and by the time I get to xornn(45), I get

 0.0018 0.0975 0.0000 0.0003 

If I run it for 10,000 iterations, it just returns all 0s.
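To actually watch it diverge, one thing I can do is print the cross-entropy cost each iteration. This is just a debugging sketch (my own addition, not in the function above), dropped inside the for-loop right after layer2 is computed:

 % Debugging sketch: track the cross-entropy cost each iteration to see
 % whether the updates are going downhill or uphill.
 C = -sum(T .* log(layer2) + (1 - T) .* log(1 - layer2));
 fprintf('iter %d: cost = %.4f\n', i, C);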

What's happening? Should I add regularization? I would have thought such a simple network wouldn't need it. But regardless, why is it moving away from the obviously good solution I handed it?

Thanks!

1 answer

AAARRGGHHH! The solution was simply a matter of changing

 theta1 = theta1 - 0.1 * theta1d;
 theta2 = theta2 - 0.1 * theta2d;

to

 theta1 = theta1 + 0.1 * theta1d;
 theta2 = theta2 + 0.1 * theta2d;

sigh

Now I need to figure out how I was somehow computing the negative of the derivative, when what I thought I was computing was... never mind. I'll post the derivation here anyway, just in case it helps someone else.

So, z is the weighted sum of inputs going into the sigmoid, and y is the sigmoid's output.

 C = -(T*log(y) + (1-T)*log(1-y))

 dC/dy = -((T/y) - (1-T)/(1-y))
       = -((T(1-y) - y(1-T)) / (y(1-y)))
       = -((T - Ty - y + Ty) / (y(1-y)))
       = -((T - y) / (y(1-y)))
       = (y - T) / (y(1-y))          # This is the source of all my woes.

 dy/dz = y(1-y)

 dC/dz = ((y - T) / (y(1-y))) * y(1-y) = y - T

So the problem was that I was accidentally computing T - y, because I forgot the negative sign in front of the cost function. Then I was subtracting what I thought was the gradient, but which was actually the negative gradient. And there it is. That was the problem.
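In other words, there are two equivalent ways to fix it (my own restatement, using the same variable names as the code):

 % Option (a): use the true output-layer gradient dC/dz2 = layer2 - T and
 % keep the subtraction (plain gradient descent).
 delta2 = layer2 - T;
 delta1 = layer1 .* (1 - layer1) .* (theta2' * delta2);
 delta1 = delta1(1:3, :);
 theta2 = theta2 - 0.1 * (delta2 * layer1');
 theta1 = theta1 - 0.1 * (delta1 * X');
 % Option (b): keep delta2 = T - layer2 (the NEGATIVE gradient, as in my
 % original code) and flip the updates to additions, which is the fix below.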

As soon as I did this:

 function layer2 = xornn(iters)
     if nargin < 1
         iters = 50
     end
     function s = sigmoid(X)
         s = 1.0 ./ (1.0 + exp(-X));
     end
     T = [0 1 1 0];
     X = [0 0 1 1; 0 1 0 1; 1 1 1 1];
     theta1 = [11 0 -5; 0 12 -7; 18 17 -20];
     theta2 = [14 13 -28 -6];
     for i = [1:iters]
         layer1 = [sigmoid(theta1 * X); 1 1 1 1];
         layer2 = sigmoid(theta2 * layer1)
         delta2 = T - layer2;
         delta1 = layer1 .* (1-layer1) .* (theta2' * delta2);
         % Remove the bias from delta1. There's no real point in a delta on the bias.
         delta1 = delta1(1:3,:);
         theta2d = delta2 * layer1';
         theta1d = delta1 * X';
         theta1 = theta1 + 0.1 * theta1d;
         theta2 = theta2 + 0.1 * theta2d;
     end
 end

xornn(50) returns 0.0028 0.9972 0.9948 0.0009, and xornn(10000) returns 0.0016 0.9989 0.9993 0.0005.
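And just to double-check, thresholding those outputs recovers the XOR truth table:

 round(xornn(10000))   % gives 0 1 1 0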

Phew! Perhaps this will help someone else in debugging their version.
