I am trying to train an LSTM in TensorFlow using minibatches, but after training completes I would like to use the model by feeding it one example at a time. I can set up the graph in TensorFlow to train my LSTM network, but afterwards I can't use the result the way I want.
The setup code looks something like this:
#Build the LSTM model.
cellRaw = rnn_cell.BasicLSTMCell(LAYER_SIZE)
cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS)
cell = rnn_cell.DropoutWrapper(cellRaw, output_keep_prob=0.25)

input_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None, 3])
target_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None])

initial_state = cell.zero_state(batch_size=BATCH_SIZE, dtype=tf.float32)

with tf.variable_scope('rnnlm'):
    output_w = tf.get_variable("output_w", [LAYER_SIZE, 6])
    output_b = tf.get_variable("output_b", [6])

input_list = tf.unpack(input_data)  # unpack into a list with one [None, 3] tensor per time step
outputs, final_state = seq2seq.rnn_decoder(input_list, initial_state, cell, loop_function=None, scope='rnnlm')
output = tf.reshape(tf.concat(1, outputs), [-1, LAYER_SIZE])
output = tf.nn.xw_plus_b(output, output_w, output_b)
... Note the two placeholders, input_data and target_data. I haven't bothered to include the optimizer setup. After training is complete and the training session is closed, I would like to start a new session that uses the trained LSTM network, whose input comes from a completely different placeholder, something like:
with tf.Session() as sess:
    with tf.variable_scope("simulation", reuse=None):
        cellSim = cellRaw
        input_data_sim = tf.placeholder(dtype=tf.float32, shape=[1, 1, 3])
        initial_state_sim = cell.zero_state(batch_size=1, dtype=tf.float32)
        input_list_sim = tf.unpack(input_data_sim)

        outputsSim, final_state_sim = seq2seq.rnn_decoder(input_list_sim, initial_state_sim, cellSim, loop_function=None, scope='rnnlm')
        outputSim = tf.reshape(tf.concat(1, outputsSim), [-1, LAYER_SIZE])

        with tf.variable_scope('rnnlm'):
            output_w = tf.get_variable("output_w", [LAYER_SIZE, nOut])
            output_b = tf.get_variable("output_b", [nOut])

        outputSim = tf.nn.xw_plus_b(outputSim, output_w, output_b)
This second part returns the following error:
tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
... presumably because the graph I am using still has the old training placeholders attached to the trained LSTM nodes. What is the right way to "extract" a trained LSTM and put it into a new, different graph with a different style of input? The variable scope features of TensorFlow seem relevant, but the examples in the documentation all talk about using variable scope as a way of managing variable names, so that the same piece of code generates similar subgraphs within the same graph. The reuse feature sounds close to what I want, but I don't find the TensorFlow documentation entirely clear on what it actually does. The cells themselves cannot be given a name (in other words,
cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS, name="multicell")
is not valid), and while I can give a name to seq2seq.rnn_decoder(), I presumably couldn't strip off the rnn_cell.DropoutWrapper() if I reused that node unchanged.
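For what it's worth, here is a minimal sketch of how I understand variable scope reuse is supposed to work within a single graph (the sizes and names below are just for illustration, not taken from my model):

import tensorflow as tf

with tf.variable_scope('rnnlm'):
    w = tf.get_variable('output_w', [4, 6])        # variable is created here

with tf.variable_scope('rnnlm', reuse=True):
    w_again = tf.get_variable('output_w', [4, 6])  # fetches the existing variable instead of creating a new one

assert w is w_again  # the exact same Variable object, so trained weights would be shared

Whether that is the intended way to get a one-example-at-a-time version of the trained network is part of what I am asking below.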
Questions:
What is the right way to move trained LSTM weights from one graph to another?
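The kind of thing I am imagining (assuming tf.train.Saver is the right mechanism; the checkpoint path is made up) is roughly:

# In the training graph/session: write the learned variables to disk.
saver = tf.train.Saver()
saver.save(sess, 'model.ckpt')   # example path only

# Later, in a completely fresh graph: rebuild variables under the same
# names (e.g. the same 'rnnlm' scope, but with batch size 1), then restore.
with tf.Graph().as_default():
    # ... rebuild the cell, placeholders and rnn_decoder here ...
    saver_sim = tf.train.Saver()
    with tf.Session() as sess_sim:
        saver_sim.restore(sess_sim, 'model.ckpt')

But I don't know whether that, or variable reuse within the original graph, is the intended approach.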
Am I correct that starting a new session "releases resources", but does not erase the graph built in memory?
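In other words, I have been assuming the distinction is roughly this (and that the second line is what actually discards the graph):

sess.close()               # releases the session's resources (e.g. variable values)
tf.reset_default_graph()   # discards the default graph structure itself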
It seems to me that the "reuse" feature lets TensorFlow look outside the current variable scope for variables with the same name (existing in another scope) and use them in the current scope. Is that right? If so, what happens to all the graph edges from the non-current scope that attach to that variable? If not, why does TensorFlow throw an error if you try to get the same variable name in two different scopes? It seems perfectly reasonable to define two variables with identical names in two different scopes, e.g. conv1/sum1 and conv2/sum1.
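A minimal illustration of what I mean (scope and variable names made up):

with tf.variable_scope('conv1'):
    a = tf.get_variable('sum1', [3])
with tf.variable_scope('conv2'):
    b = tf.get_variable('sum1', [3])

print(a.name, b.name)   # conv1/sum1:0  conv2/sum1:0 -- no name collision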
In my code I am working inside a new scope, yet the graph won't run without data being fed to a placeholder from the original, default scope. Is the default scope always "in scope" for some reason?
If graph edges can span different scopes, and names in different scopes can't be shared unless they refer to the exact same node, then that would seem to defeat the purpose of using different scopes in the first place. What am I misunderstanding here?
Thanks!