Basic neural network in TensorFlow

I am trying to implement a very basic neural network in TensorFlow, but I am having some problems. It is a very simple network that takes two input values (hours of sleep and hours of study) and predicts a score on a test (I found this example online). So basically I have one hidden layer with three units, each of which computes a sigmoid activation function; the cost function is the sum of squared errors, and I use gradient descent to minimize it. The problem is that when I train the network with the training data and then make predictions on that same training data, the results do not match the targets at all, and they also seem strange because they are all nearly identical to each other.

    import tensorflow as tf
    import numpy as np
    import input_data

    sess = tf.InteractiveSession()

    # create a 2-D version of input for plotting
    trX = np.matrix(([3,5], [5,1], [10,2]), dtype=float)
    trY = np.matrix(([85], [82], [93]), dtype=float)  # 3x1 matrix
    trX = trX / np.max(trX, axis=0)
    trY = trY / 100  # 100 is the maximum score allowed

    teX = np.matrix(([3,5]), dtype=float)
    teY = np.matrix(([85]), dtype=float)
    teX = teX / np.amax(teX, axis=0)
    teY = teY / 100

    def init_weights(shape):
        return tf.Variable(tf.random_normal(shape, stddev=0.01))

    def model(X, w_h, w_o):
        z2 = tf.matmul(X, w_h)
        a2 = tf.nn.sigmoid(z2)  # this is a basic mlp, think 2 stacked logistic regressions
        z3 = tf.matmul(a2, w_o)
        yHat = tf.nn.sigmoid(z3)
        return yHat  # note that we don't take the softmax at the end because our cost fn does that for us

    X = tf.placeholder("float", [None, 2])
    Y = tf.placeholder("float", [None, 1])

    W1 = init_weights([2, 3])  # create symbolic variables
    W2 = init_weights([3, 1])
    sess.run(tf.initialize_all_variables())

    py_x = model(X, W1, W2)

    cost = tf.reduce_mean(tf.square(py_x - Y))
    train_op = tf.train.GradientDescentOptimizer(0.5).minimize(cost)  # construct an optimizer
    predict_op = py_x

    sess.run(train_op, feed_dict={X: trX, Y: trY})

    print sess.run(predict_op, feed_dict={X: trX})
    sess.close()

This gives:

    [[0.51873487]
     [0.51874501]
     [0.51873082]]

whereas I believe the predictions should be close to the training targets (trY).

I am completely new to neural networks and machine learning, so please forgive any mistakes. Thanks in advance.

1 answer

The main reason your network is not training is that the statement:

 sess.run(train_op, feed_dict={X: trX, Y: trY}) 

... is executed only once. In TensorFlow, running train_op (or any operation returned from Optimizer.minimize()) causes the network to take a single gradient descent step. You must execute it in a loop to perform iterative training; the weights will then converge over many steps.
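For example, a minimal fix is to wrap that call in a loop, using the variable names from your code (1000 iterations here is an arbitrary choice; any sufficiently large number will do):

    # run many gradient descent steps instead of just one
    for i in range(1000):
        sess.run(train_op, feed_dict={X: trX, Y: trY})

    print sess.run(predict_op, feed_dict={X: trX})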

Two other tips: (i) you may achieve faster convergence if you feed a random subset (a mini-batch) of your training data at each step, rather than the entire dataset (see the sketch below); and (ii) a learning rate of 0.5 is probably too high (although this depends on the data).
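A sketch of tip (i), again reusing the names from your code; the batch_size value is an arbitrary choice, and note that this toy training set has only three rows, so mini-batching only pays off on larger data:

    batch_size = 2  # arbitrary; must be <= number of training rows
    for i in range(1000):
        # sample a random mini-batch of rows at each step
        idx = np.random.choice(trX.shape[0], batch_size, replace=False)
        sess.run(train_op, feed_dict={X: trX[idx], Y: trY[idx]})

For tip (ii), you would simply pass a smaller value to the optimizer, e.g. tf.train.GradientDescentOptimizer(0.01), and experiment from there.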
