TensorFlow: Non-reproducible results

Problem

I have a Python script that uses TensorFlow to create a multilayer perceptron network (with dropout) in order to do binary classification. Even though I have been careful to set both the Python and the TensorFlow seeds, I get non-repeatable results. If I run once and then run again, I get different results. I can even run once, quit Python, restart Python, run again and get different results.

What I tried

I know other people have asked questions about getting non-repeatable results in TensorFlow (e.g., "How to get stable results ...", "set_random_seed doesn't work ...", "How to get reproducible results in TensorFlow"), and the answers usually turn out to be an incorrect use or understanding of tf.set_random_seed(). I made sure to implement the solutions given, but that did not solve my problem.

A common mistake is not realizing that tf.set_random_seed() is only a graph-level seed, and that running the script multiple times will alter the graph, which would explain the non-repeatable results. I used the following statement to print out the entire graph and verified (via diff) that the graph is the same even when the results are different.

```python
print [n.name for n in tf.get_default_graph().as_graph_def().node]
```

I have also used function calls like tf.reset_default_graph() and tf.get_default_graph().finalize() to avoid any changes to the graph, even though this is probably overkill.

Code (relevant)

My script is ~360 lines long, so here are the relevant lines (with snipped code indicated). Any items that are in ALL_CAPS are constants that are defined in my Parameters block below.

```python
import random  # needed for random.seed / random.sample below
import numpy as np
import tensorflow as tf
from copy import deepcopy
from tqdm import tqdm  # Progress bar

# --------------------------------- Parameters ---------------------------------
(snip)

# --------------------------------- Functions ----------------------------------
(snip)

# ------------------------------ Obtain Train Data -----------------------------
(snip)

# ------------------------------ Obtain Test Data ------------------------------
(snip)

random.seed(12345)
tf.set_random_seed(12345)

(snip)

# ------------------------- Build the TensorFlow Graph -------------------------
tf.reset_default_graph()
with tf.Graph().as_default():

    x = tf.placeholder("float", shape=[None, N_INPUT])
    y_ = tf.placeholder("float", shape=[None, N_CLASSES])

    # Store layers weight & bias
    weights = {
        'h1': tf.Variable(tf.random_normal([N_INPUT, N_HIDDEN_1])),
        'h2': tf.Variable(tf.random_normal([N_HIDDEN_1, N_HIDDEN_2])),
        'h3': tf.Variable(tf.random_normal([N_HIDDEN_2, N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_HIDDEN_3, N_CLASSES]))
    }
    biases = {
        'b1': tf.Variable(tf.random_normal([N_HIDDEN_1])),
        'b2': tf.Variable(tf.random_normal([N_HIDDEN_2])),
        'b3': tf.Variable(tf.random_normal([N_HIDDEN_3])),
        'out': tf.Variable(tf.random_normal([N_CLASSES]))
    }

    # Construct model
    pred = multilayer_perceptron(x, weights, biases, USE_DROP_LAYERS, DROP_KEEP_PROB)

    mean1 = tf.reduce_mean(weights['h1'])
    mean2 = tf.reduce_mean(weights['h2'])
    mean3 = tf.reduce_mean(weights['h3'])

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y_))
    regularizers = (tf.nn.l2_loss(weights['h1']) + tf.nn.l2_loss(biases['b1']) +
                    tf.nn.l2_loss(weights['h2']) + tf.nn.l2_loss(biases['b2']) +
                    tf.nn.l2_loss(weights['h3']) + tf.nn.l2_loss(biases['b3']))
    cost += COEFF_REGULAR * regularizers

    optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cost)

    out_labels = tf.nn.softmax(pred)

    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())

    tf.get_default_graph().finalize()  # Lock the graph as read-only

    # Print the default graph in text form
    print [n.name for n in tf.get_default_graph().as_graph_def().node]

    # --------------------------------- Training ----------------------------------
    print "Start Training"
    pbar = tqdm(total=TRAINING_EPOCHS)
    for epoch in range(TRAINING_EPOCHS):
        avg_cost = 0.0
        batch_iter = 0

        train_outfile.write(str(epoch))

        while batch_iter < BATCH_SIZE:
            train_features = []
            train_labels = []
            batch_segments = random.sample(train_segments, 20)
            for segment in batch_segments:
                train_features.append(segment[0])
                train_labels.append(segment[1])
            sess.run(optimizer, feed_dict={x: train_features, y_: train_labels})
            line_out = "," + str(batch_iter) + "\n"
            train_outfile.write(line_out)
            line_out = ",," + str(sess.run(mean1, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean2, feed_dict={x: train_features, y_: train_labels}))
            line_out += "," + str(sess.run(mean3, feed_dict={x: train_features, y_: train_labels})) + "\n"
            train_outfile.write(line_out)
            avg_cost += sess.run(cost, feed_dict={x: train_features, y_: train_labels}) / BATCH_SIZE
            batch_iter += 1

        line_out = ",,,,," + str(avg_cost) + "\n"
        train_outfile.write(line_out)
        pbar.update(1)  # Increment the progress bar by one

    train_outfile.close()
    print "Completed training"

    # ------------------------------ Testing & Output ------------------------------
    keep_prob = 1.0  # Do not use dropout when testing

    print "now reducing mean"
    print(sess.run(mean1, feed_dict={x: test_features, y_: test_labels}))

    print "TRUE LABELS"
    print(test_labels)
    print "PREDICTED LABELS"
    pred_labels = sess.run(out_labels, feed_dict={x: test_features})
    print(pred_labels)
    output_accuracy_results(pred_labels, test_labels)

    sess.close()
```

What is not reproducible

As you can see, I am outputting results for each epoch to a file, and I also print accuracy numbers at the end. None of these match from run to run, even though I believe I have set the seeds correctly. I have used both random.seed(12345) and tf.set_random_seed(12345).

Please let me know if I need to provide more information. And thanks for any help.

-DG

Setup Information

TensorFlow version 0.8.0 (CPU only)
Enthought Canopy version 1.7.2 (Python 2.7, not 3.x)
Mac OS X Version 10.11.3

5 answers

You need to set the operation-level seed in addition to the graph-level seed, i.e.

```python
tf.reset_default_graph()
a = tf.constant([1, 1, 1, 1, 1], dtype=tf.float32)
graph_level_seed = 1
operation_level_seed = 1
tf.set_random_seed(graph_level_seed)
b = tf.nn.dropout(a, 0.5, seed=operation_level_seed)
```

See this TensorFlow GitHub issue. Some operations on the GPU are not fully deterministic (a speed vs. precision trade-off).

I also noticed that for the seed to have any effect, tf.set_random_seed(...) needs to be called before the Session is created. And you should also either completely restart the Python interpreter each time you run your code, or call tf.reset_default_graph() at the start.
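The ordering requirement here (seed first, then create whatever consumes the randomness) is the same rule Python's own random module follows. A minimal stdlib-only sketch of the principle, for illustration only (it does not use TensorFlow):

```python
import random

def draw_after_seeding():
    # Seed BEFORE drawing anything -- the analogue of calling
    # tf.set_random_seed() before the Session is created.
    random.seed(12345)
    return [random.random() for _ in range(3)]

run1 = draw_after_seeding()
run2 = draw_after_seeding()
assert run1 == run2  # re-seeding restores exactly the same stream
```

If the seed were set after the first draw instead, the two runs would diverge, which is the same failure mode as seeding TensorFlow after the Session already exists.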


Just to add to Yaroslav's answer: you should also set the numpy seed in addition to the operation-level and graph-level seeds, since some operations depend on numpy. np.random.seed() did the trick for me with TensorFlow 1.1.0.
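To illustrate why this is a separate seed: numpy's generator is independent of both Python's random module and TensorFlow's, so it must be re-seeded on its own. A minimal sketch (assuming numpy is installed):

```python
import numpy as np

np.random.seed(1)           # numpy's seed is separate from Python's and TF's
first = np.random.rand(3)

np.random.seed(1)           # re-seed with the same value...
second = np.random.rand(3)  # ...and the same numbers come back

assert np.allclose(first, second)
```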


Here is what I did to get reproducible results when training and testing a huge network using TensorFlow:

  • This was tested on Ubuntu 16.04, tensorflow 1.9.0, python 2.7, both on GPU and CPU
  • Add these lines of code before doing anything else in your code (the first few lines of the main function):
```python
import os
import random
import numpy as np
import tensorflow as tf

SEED = 1  # use this constant seed everywhere
os.environ['PYTHONHASHSEED'] = str(SEED)
random.seed(SEED)         # 'python' built-in pseudo-random generator
np.random.seed(SEED)      # numpy pseudo-random generator
tf.set_random_seed(SEED)  # tensorflow pseudo-random generator
```
  • Reset the default graph before creating the session:
```python
tf.reset_default_graph()  # this goes before sess = tf.Session()
```
  • Find all the TensorFlow functions in your code that take a seed argument, and pass your constant seed to all of them (SEED in my code)

Here are a few of these functions: tf.nn.dropout, tf.contrib.layers.xavier_initializer, etc.

Note: this step may seem unnecessary, because we already use tf.set_random_seed to seed TensorFlow, but believe me, you need it! See Yaroslav's answer.
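One caveat about the PYTHONHASHSEED line above, not stated in the answer itself: CPython reads PYTHONHASHSEED once, at interpreter startup, so setting it via os.environ inside an already-running script only affects child processes, not the current interpreter. A stdlib-only sketch (the string 'reproducibility' is just an arbitrary example) showing the variable pinning str hashes across fresh interpreter runs:

```python
import os
import subprocess
import sys

def child_hash(seed):
    # Launch a fresh interpreter with PYTHONHASHSEED fixed before startup,
    # and report the hash of a string computed inside it.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('reproducibility'))"], env=env)
    return int(out)

# With the same seed, str hashes (and therefore any set/dict iteration
# order derived from them) are identical across interpreter runs.
assert child_hash(1) == child_hash(1)
```

To make it take effect for the main script itself, export the variable in the shell (or a wrapper) before launching Python.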


In TensorFlow 2.0, tf.set_random_seed(42) has changed to tf.random.set_seed(42).

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/random/set_seed

This should be the only seed needed when using TensorFlow.

