Here is the code I'm working with. I am trying to get a 1, a 0, or, ideally, a probability as the result on a real test set. When I just split up the training set and run the model on the held-out part, I get an accuracy of about 93%, but when I train the program and run it on the actual test set (the one where column 1 is not filled in with 1s and 0s), it returns nothing but nan.
import tensorflow as tf
import numpy as np
from numpy import genfromtxt
import sklearn

# Convert to one hot
def convertOneHot(data):
    y = np.array([int(i[0]) for i in data])
    y_onehot = [0]*len(y)
    for i, j in enumerate(y):
        y_onehot[i] = [0]*(y.max() + 1)
        y_onehot[i][j] = 1
    return (y, y_onehot)

data = genfromtxt('cs-training.csv', delimiter=',')  # Training data
test_data = genfromtxt('cs-test-actual.csv', delimiter=',')  # Actual test data

# This part is to get rid of the nan's at the start of the actual test data
g = 0
for i in test_data:
    i[0] = 1
    test_data[g] = i
    g += 1

x_train = np.array([i[1::] for i in data])
y_train, y_train_onehot = convertOneHot(data)

x_test = np.array([i[1::] for i in test_data])
y_test, y_test_onehot = convertOneHot(test_data)

A = data.shape[1]-1  # Number of features; note the first column is y
B = len(y_train_onehot[0])

tf_in = tf.placeholder("float", [None, A])  # Features
tf_weight = tf.Variable(tf.zeros([A, B]))
tf_bias = tf.Variable(tf.zeros([B]))
tf_softmax = tf.nn.softmax(tf.matmul(tf_in, tf_weight) + tf_bias)

# Training via backpropagation
tf_softmax_correct = tf.placeholder("float", [None, B])
tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax))

# Train using tf.train.GradientDescentOptimizer
tf_train_step = tf.train.GradientDescentOptimizer(0.01).minimize(tf_cross_entropy)

# Add accuracy checking nodes
tf_correct_prediction = tf.equal(tf.argmax(tf_softmax, 1), tf.argmax(tf_softmax_correct, 1))
tf_accuracy = tf.reduce_mean(tf.cast(tf_correct_prediction, "float"))

saver = tf.train.Saver([tf_weight, tf_bias])

# Initialize and run
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

print("...")
# Run the training
for i in range(1):
    sess.run(tf_train_step, feed_dict={tf_in: x_train, tf_softmax_correct: y_train_onehot})
    #print y_train_onehot

saver.save(sess, 'trained_csv_model')

ans = sess.run(tf_softmax, feed_dict={tf_in: x_test})
print ans

#Print accuracy
#result = sess.run(tf_accuracy, feed_dict={tf_in: x_test, tf_softmax_correct: y_test_onehot})
#print result
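To rule out bad input on my side, here is a quick sanity check I can run right after building x_train and x_test (plain numpy, nothing beyond what is already imported above):

# Count NaNs that survive the first-column fix above; any NaN left in the
# features will propagate straight through the matmul and poison the softmax.
print(np.isnan(x_train).sum(), "NaN values left in x_train")
print(np.isnan(x_test).sum(), "NaN values left in x_test")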
When I print ans, I get the following:
[[ nan  nan]
 [ nan  nan]
 [ nan  nan]
 ...,
 [ nan  nan]
 [ nan  nan]
 [ nan  nan]]
I don't know what I'm doing wrong here. All I want is for ans to yield a 1, a 0, or, ideally, an array of probabilities where each entry has length 2.
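One guess I want to try, assuming the nans come from tf.log(0) when the softmax saturates to exactly 0 for one class, is clipping the softmax output before taking the log. This is my own workaround sketch, replacing the tf_cross_entropy line above:

# Keep the softmax output away from 0 so tf.log cannot return -inf;
# otherwise the cross-entropy, and then every gradient, becomes NaN.
tf_cross_entropy = -tf.reduce_sum(
    tf_softmax_correct * tf.log(tf.clip_by_value(tf_softmax, 1e-10, 1.0)))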
I don't expect that many people will be able to answer this, but please try at least. I've been stuck here for two days waiting for a moment of brilliance that never came, so I figured I'd ask. Thanks!
test_data looks like this:
[[ 1.00000000e+00  8.85519080e-01  4.30000000e+01 ...,  0.00000000e+00  0.00000000e+00  0.00000000e+00]
 [ 1.00000000e+00  4.63295269e-01  5.70000000e+01 ...,  4.00000000e+00  0.00000000e+00  2.00000000e+00]
 [ 1.00000000e+00  4.32750360e-02  5.90000000e+01 ...,  1.00000000e+00  0.00000000e+00  2.00000000e+00]
 ...,
 [ 1.00000000e+00  8.15963730e-02  7.00000000e+01 ...,  0.00000000e+00  0.00000000e+00             nan]
 [ 1.00000000e+00  3.35456547e-01  5.60000000e+01 ...,  2.00000000e+00  1.00000000e+00  3.00000000e+00]
 [ 1.00000000e+00  4.41841663e-01  2.90000000e+01 ...,  0.00000000e+00  0.00000000e+00  0.00000000e+00]]
And the only reason the first element in each row is 1 is that I overwrote the nan that filled that position, to avoid errors. Note that everything after the first column is a feature; the first column is what I'm trying to predict.
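Since the dump above shows at least one nan surviving in the feature columns, a cleanup I plan to try is zero-filling them before training. This is my own assumption that a blank CSV field can be treated as 0:

# Replace any remaining NaN feature values with 0 so they cannot propagate
# through the matmul; this assumes a blank field really means "no value".
x_train = np.nan_to_num(x_train)
x_test = np.nan_to_num(x_test)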
EDIT:
I changed the code to the following:
import tensorflow as tf
import numpy as np
from numpy import genfromtxt
import sklearn
from sklearn.cross_validation import train_test_split
from tensorflow import Print
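and wrapped the main graph nodes with Print so TensorFlow logs them when the graph runs. Roughly like this (a sketch of my edit; the exact message strings don't matter, and printing these wrapped Python objects is what produces the Tensor(...) lines below):

# Wrap each node in a Print op so its value is logged during sess.run();
# printing the wrapped objects themselves shows their shapes and dtypes.
tf_bias = Print(tf_bias, [tf_bias], "Bias: ")
tf_weight = Print(tf_weight, [tf_weight], "Weight: ")
tf_in = Print(tf_in, [tf_in], "tf_in: ")
tf_softmax = Print(tf_softmax, [tf_softmax], "Softmax: ")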
After printing them out, I see that one of the objects is a Boolean. I don't know if that is the problem, but take a look at the following and see if there is any way you can help.
Tensor("Print_16:0", shape=TensorShape([Dimension(2)]), dtype=float32) Tensor("Print_17:0", shape=TensorShape([Dimension(10), Dimension(2)]), dtype=float32) Tensor("Print_18:0", shape=TensorShape([Dimension(None), Dimension(10)]), dtype=float32) Tensor("Print_19:0", shape=TensorShape([Dimension(None), Dimension(2)]), dtype=float32) Tensor("Placeholder_9:0", shape=TensorShape([Dimension(None), Dimension(2)]), dtype=float32) Tensor("Equal_4:0", shape=TensorShape([Dimension(None)]), dtype=bool) Tensor("Mean_4:0", shape=TensorShape([]), dtype=float32) ... [[ nan nan] [ nan nan] [ nan nan] ..., [ nan nan] [ nan nan] [ nan nan]]