I am trying to implement asynchronous deep reinforcement learning methods, and for one of the steps I need to accumulate the gradients over several steps and then apply them. What is the best way to achieve this in TensorFlow? I got as far as accumulating the gradients, but I don't think this is the fastest way to do it (many transfers from TensorFlow to Python and back). Any suggestions are welcome. Below is my toy NN code. It does not model or compute everything; it simply performs the operations I want to use.
import tensorflow as tf
from model import *

graph = tf.Graph()

with graph.as_default():
    state = tf.placeholder(tf.float32, shape=[None, 80, 80, 1])

    with tf.variable_scope('layer1'):
        W = weight_variable([8, 8, 1, 32])
        variable_summaries(W, "layer1/W")
        b = bias_variable([32])
        variable_summaries(b, "layer1/b")
        h = conv2d(state, W, 4) + b
        activation = tf.nn.relu(h)
        pool1 = max_pool_2x2(activation)
        print(pool1.get_shape())
        pool1 = tf.reshape(pool1, [-1, 3200])

    with tf.variable_scope('readout'):
        W = weight_variable([3200, 3])
        b = bias_variable([3])
        logits = tf.matmul(pool1, W) + b
        variable_summaries(h, "y")

    action_indexes = tf.placeholder(tf.int32, shape=[None], name="action_indexes")
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, action_indexes)

    starter_learning_rate = 1e-6
    global_step = tf.Variable(0, trainable=False)
Updated code after @yaroslave's suggestion:
import tensorflow as tf
from model import *

graph = tf.Graph()

with graph.as_default():
    minibatch = 32
    state = tf.placeholder(tf.float32, shape=[minibatch, 80, 80, 1], name="input")

    with tf.variable_scope('layer1'):
        W = weight_variable([8, 8, 1, 32])
        variable_summaries(W, "layer1/W")
        b = bias_variable([32])
        variable_summaries(b, "layer1/b")
        h = conv2d(state, W, 4) + b
        activation = tf.nn.relu(h)
        pool1 = max_pool_2x2(activation)
        print(pool1.get_shape())
        pool1 = tf.reshape(pool1, [-1, 3200])

    with tf.variable_scope('readout'):
        W = weight_variable([3200, 3])
        b = bias_variable([3])
        logits = tf.matmul(pool1, W) + b
        variable_summaries(h, "y")

    action_indexes = tf.placeholder(tf.int32, shape=[minibatch], name="action_indexes")
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, action_indexes)

    starter_learning_rate = 1e-6
    global_step = tf.Variable(0, trainable=False)
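For context, the kind of in-graph accumulation I am aiming for looks roughly like this (a minimal, untested sketch: it assumes the `loss`, `starter_learning_rate`, and `global_step` defined above, and uses a plain GradientDescentOptimizer only as an example; the accumulator variables are my own addition, not part of any TensorFlow API):

    # Sketch: accumulate gradients into non-trainable variables entirely
    # inside the graph, so no gradient tensors cross into Python between steps.
    mean_loss = tf.reduce_mean(loss)  # reduce the per-example losses to a scalar

    optimizer = tf.train.GradientDescentOptimizer(starter_learning_rate)
    grads_and_vars = optimizer.compute_gradients(mean_loss)

    # One accumulator variable per trainable variable, initialized to zeros.
    accumulators = [tf.Variable(tf.zeros_like(v.initialized_value()), trainable=False)
                    for _, v in grads_and_vars]

    # Run this op once per minibatch to add the current gradients in-graph.
    accumulate_op = tf.group(*[acc.assign_add(g)
                               for acc, (g, _) in zip(accumulators, grads_and_vars)])

    # Run this op to apply the accumulated gradients, then zero the accumulators.
    apply_op = optimizer.apply_gradients(
        [(acc, v) for acc, (_, v) in zip(accumulators, grads_and_vars)],
        global_step=global_step)
    with tf.control_dependencies([apply_op]):
        train_op = tf.group(*[acc.assign(tf.zeros_like(acc)) for acc in accumulators])

The idea is that only `accumulate_op` and `train_op` would be triggered from Python; the gradient values themselves would never leave the TensorFlow runtime.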