I'm in the middle of Andrew Ng's Machine Learning course and would like to adapt the neural network that I completed as part of Assignment 4.
The neural network I completed correctly for that assignment was as follows:
- Sigmoid activation function: g(z) = 1/(1+e^(-z)) (a small sketch of this helper follows this list)
- 10 output units, each of which can take the value 0 or 1
- 1 hidden layer
- Backpropagation is used to minimize the cost function.
- Cost function:

  J(Theta) = -(1/m) * sum_{i=1..m} sum_{k=1..K} [ y_k^(i) * log((h_Theta(x^(i)))_k) + (1 - y_k^(i)) * log(1 - (h_Theta(x^(i)))_k) ] + (lambda/(2m)) * sum_{l=1..L-1} sum_{i=1..s_l} sum_{j=1..s_(l+1)} (Theta_(j,i)^(l))^2

  where L = number of layers, s_l = number of units in layer l, m = number of training examples, K = number of output units.
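For reference, the sigmoid and sigmoidGradient helpers called in the code further down are one-liners; here is a minimal sketch of what they compute, matching g(z) = 1/(1+e^(-z)) above and the gradient f(z)*(1-f(z)) mentioned later (in the exercise they sit in their own .m files):

    % Sigmoid activation, applied element-wise: g(z) = 1 / (1 + e^(-z))
    function g = sigmoid(z)
        g = 1.0 ./ (1.0 + exp(-z));
    end

    % Derivative of the sigmoid, element-wise: g'(z) = g(z) * (1 - g(z))
    function g = sigmoidGradient(z)
        g = sigmoid(z) .* (1 - sigmoid(z));
    end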
Now I want to modify the exercise so that there is a single continuous output unit that takes any value in [0,1], and I'm trying to figure out what needs to change. So far I have:
- Replaced the data with my own, so that the output is a continuous variable between 0 and 1
- Updated references to the number of output units (a rough sketch of the resulting sizes follows this list)
- Updated the cost function in the backpropagation code to a squared-error cost:

  J = (1/(2m)) * sum_{i=1..m} (a_3^(i) - y^(i))^2

  where a_3 is the value of the output unit computed by forward propagation.
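To make the first two changes concrete, here is a rough sketch of the sizes I am now working with; the layer sizes and the initialization range below are illustrative placeholders, not my actual values:

    % Illustrative sizes after switching to one continuous output unit
    input_layer_size  = 400;   % number of input features (placeholder)
    hidden_layer_size = 25;    % hidden-layer size (placeholder)
    num_outputs       = 1;     % was 10 output units, now a single unit

    % y is now an m x 1 vector of continuous targets in [0,1],
    % instead of an m x 10 matrix of 0/1 labels.
    % Theta1 is hidden_layer_size x (input_layer_size + 1)
    % Theta2 is num_outputs x (hidden_layer_size + 1)
    epsilon_init = 0.12;       % small random-initialization range (placeholder)
    Theta1 = rand(hidden_layer_size, input_layer_size + 1) * 2 * epsilon_init - epsilon_init;
    Theta2 = rand(num_outputs, hidden_layer_size + 1) * 2 * epsilon_init - epsilon_init;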
I'm sure something else needs to change, because gradient checking shows that the gradient computed by backpropagation no longer matches the numerical approximation. I have not changed the sigmoid gradient; it remains f(z)*(1-f(z)), where f(z) is the sigmoid function 1/(1+e^(-z)). Nor have I changed the numerical approximation of the derivative, which is still (J(theta+e) - J(theta-e))/(2e).
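For clarity, this is roughly the numerical check I mean; the function and variable names here (numericalGradient, costFunc) are placeholders rather than the exact course code:

    % Central-difference approximation (J(theta+e) - J(theta-e)) / (2e)
    % costFunc is a handle that returns the cost J for an unrolled parameter vector.
    function numgrad = numericalGradient(costFunc, theta)
        numgrad = zeros(size(theta));
        perturb = zeros(size(theta));
        e = 1e-4;                               % perturbation size
        for p = 1:numel(theta)
            perturb(p) = e;
            loss1 = costFunc(theta - perturb);  % J(theta - e) in direction p
            loss2 = costFunc(theta + perturb);  % J(theta + e) in direction p
            numgrad(p) = (loss2 - loss1) / (2*e);
            perturb(p) = 0;
        end
    end

The resulting numgrad is then compared element-by-element with the unrolled backpropagation gradient; after my changes the two no longer agree.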
Can someone tell me what other steps will be required?
The MATLAB code is as follows:
    % FORWARD PROPAGATION
    % input layer
    a1 = [ones(m,1), X];
    % hidden layer
    z2 = a1*Theta1';
    a2 = sigmoid(z2);
    a2 = [ones(m,1), a2];
    % output layer
    z3 = a2*Theta2';
    a3 = sigmoid(z3);

    % BACKWARD PROPAGATION
    delta3 = a3 - y;
    delta2 = delta3*Theta2(:,2:end) .* sigmoidGradient(z2);
    Theta1_grad = (delta2'*a1)/m;
    Theta2_grad = (delta3'*a2)/m;

    % COST FUNCTION
    J = 1/(2*m) * sum((a3 - y).^2);

    % Implement regularization with the cost function and gradients.
    Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + Theta1(:,2:end)*lambda/m;
    Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + Theta2(:,2:end)*lambda/m;
    J = J + lambda/(2*m)*( sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)) );
I have since realized that this question is similar to the one asked at https://stackoverflow.com/a/29569/...; however, in my case I want the continuous output to stay between 0 and 1, and therefore I use a sigmoid activation at the output.