I have the following problem: after one iteration, nearly all of my parameters (cost function, weights, hypothesis function, etc.) output "NaN". My code is similar to the TensorFlow MNIST-Expert tutorial ( https://www.tensorflow.org/versions/r0.9/tutorials/mnist/pros/index.html ). I already searched for solutions and have tried: reducing the learning rate, eventually all the way to zero; using AdamOptimizer instead of gradient descent; using the sigmoid function for the hypothesis in the last layer; and using only numpy functions. My input data contains some negative and zero values, so I cannot use logarithmic cross-entropy instead of the quadratic cost function. The outcome is always the same. All my input data consists of stresses and deformations of soils.
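To make the cross-entropy point concrete, this is the kind of cost I mean (a sketch only, using the tensor names hypL and y_true from the code below):

    # Sketch of the logarithmic cross-entropy I ruled out. Clipping the
    # predictions keeps tf.log finite, but the zero and negative entries
    # in y_true still make this cost meaningless for my data.
    cross_entropy = -tf.reduce_sum(
        y_true * tf.log(tf.clip_by_value(hypL, 1e-10, 1.0)))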
    import tensorflow as tf
    import Datafiles3_pv_complete as soil
    import numpy as np

    m_training = int(18.0)
    m_cv = int(5.0)
    m_test = int(5.0)
    total_examples = 28

    " range for running "
    range_training = xrange(0, m_training)
    range_cv = xrange(m_training, (m_training + m_cv))
    range_test = xrange((m_training + m_cv), total_examples)

    """ Using interactive Sessions """
    sess = tf.InteractiveSession()

    """ creating input and output vectors """
    x = tf.placeholder(tf.float32, shape=[None, 11])
    y_true = tf.placeholder(tf.float32, shape=[None, 3])

    """ Standard Deviation Calculation """
    stdev = np.divide(2.0, np.sqrt(np.prod(x.get_shape().as_list()[1:])))

    """ Weights and Biases """
    def weights(shape):
        initial = tf.truncated_normal(shape, stddev=stdev)
        return tf.Variable(initial)

    def bias(shape):
        initial = tf.truncated_normal(shape, stddev=1.0)
        return tf.Variable(initial)

    """ Creating weights and biases for all layers """
    theta1 = weights([11, 7])
    bias1 = bias([1, 7])

    theta2 = weights([7, 7])
    bias2 = bias([1, 7])

    " Last layer "
    theta3 = weights([7, 3])
    bias3 = bias([1, 3])

    """ Hidden layer input (sum of weights, activations and bias)
        z = theta^T * activation + bias
    """
    def Z_Layer(activation, theta, bias):
        return tf.add(tf.matmul(activation, theta), bias)

    """ Creating the sigmoid function
        sigmoid = 1 / (1 + exp(-z))
    """
    def Sigmoid(z):
        return tf.div(tf.constant(1.0), tf.add(tf.constant(1.0), tf.exp(tf.neg(z))))

    """ hypothesis functions - predicted output """
    ' layer 1 - input layer '
    hyp1 = x

    ' layer 2 '
    z2 = Z_Layer(hyp1, theta1, bias1)
    hyp2 = Sigmoid(z2)

    ' layer 3 '
    z3 = Z_Layer(hyp2, theta2, bias2)
    hyp3 = Sigmoid(z3)

    ' layer 4 - output layer '
    zL = Z_Layer(hyp3, theta3, bias3)
    hypL = tf.add(tf.add(tf.pow(zL, 3), tf.pow(zL, 2)), zL)

    """ Cost function """
    cost_function = tf.mul(tf.div(0.5, m_training), tf.pow(tf.sub(hypL, y_true), 2))
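What I have not managed to do is pinpoint which operation produces the first NaN. A check-numerics pass should do that; below is a minimal sketch (assuming the r0.9 API; train_step, x_batch and y_batch are placeholder names, not my actual training code):

    # Minimal NaN-localization sketch (placeholder names, r0.9 API assumed):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cost_function)
    check_op = tf.add_check_numerics_ops()  # fails fast on the first NaN/Inf tensor
    sess.run(tf.initialize_all_variables())
    # Running the check together with one training step should name the
    # offending tensor instead of letting NaNs propagate silently:
    # sess.run([train_step, check_op], feed_dict={x: x_batch, y_true: y_batch})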
During the last weeks I have also been working on a Unit model of this problem, but it produced the same output. I have no idea what to try next. I hope someone can help me.
Edit:
I checked some parameters again. For layers 3 and 4 (the last layer), the hypothesis (hyp) and pre-activation (z) tensors have identical entries for every data point, i.e. every row of a given column holds the same value.
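A sketch of the kind of check I mean (x_batch stands for one batch of my input data; not my exact code):

    # Inspect the layer tensors for one batch (x_batch is a placeholder name):
    z3_val, hyp3_val, zL_val, hypL_val = sess.run(
        [z3, hyp3, zL, hypL], feed_dict={x: x_batch})
    # If every row of a column holds the same value, the per-column
    # range (max - min) is zero:
    print(np.ptp(z3_val, axis=0))
    print(np.ptp(hyp3_val, axis=0))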