TensorFlow: using weights trained in one model inside another, different model

I am trying to train an LSTM in TensorFlow using minibatches, but after training completes I would like to use the model by presenting it with one example at a time. I can set up a graph in TensorFlow to train my LSTM network, but afterwards I can't use the result the way I want.

The setup code looks something like this:

    # Build the LSTM model.
    # (Imports assumed: import tensorflow as tf, plus TF's rnn_cell and seq2seq modules.)
    cellRaw = rnn_cell.BasicLSTMCell(LAYER_SIZE)
    cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS)
    cell = rnn_cell.DropoutWrapper(cellRaw, output_keep_prob=0.25)

    input_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None, 3])
    target_data = tf.placeholder(dtype=tf.float32, shape=[SEQ_LENGTH, None])
    initial_state = cell.zero_state(batch_size=BATCH_SIZE, dtype=tf.float32)
    input_list = tf.unpack(input_data)  # split the sequence into a list of per-step tensors

    with tf.variable_scope('rnnlm'):
        output_w = tf.get_variable("output_w", [LAYER_SIZE, 6])
        output_b = tf.get_variable("output_b", [6])

    outputs, final_state = seq2seq.rnn_decoder(input_list, initial_state, cell,
                                               loop_function=None, scope='rnnlm')
    output = tf.reshape(tf.concat(1, outputs), [-1, LAYER_SIZE])
    output = tf.nn.xw_plus_b(output, output_w, output_b)

... Note the two placeholders, input_data and target_data. I haven't bothered to include the optimizer setup. After training is complete and the training session is closed, I would like to start a new session that uses the trained LSTM network, whose input is provided by a completely different placeholder, something like:

    with tf.Session() as sess:
        with tf.variable_scope("simulation", reuse=None):
            cellSim = cellRaw
            input_data_sim = tf.placeholder(dtype=tf.float32, shape=[1, 1, 3])
            initial_state_sim = cell.zero_state(batch_size=1, dtype=tf.float32)
            input_list_sim = tf.unpack(input_data_sim)

            outputsSim, final_state_sim = seq2seq.rnn_decoder(input_list_sim, initial_state_sim, cellSim,
                                                              loop_function=None, scope='rnnlm')
            outputSim = tf.reshape(tf.concat(1, outputsSim), [-1, LAYER_SIZE])

            with tf.variable_scope('rnnlm'):
                output_w = tf.get_variable("output_w", [LAYER_SIZE, nOut])
                output_b = tf.get_variable("output_b", [nOut])

            outputSim = tf.nn.xw_plus_b(outputSim, output_w, output_b)

This second part returns the following error:

 tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] 

... Presumably because the graph I'm using still has the old training placeholders attached to the trained LSTM nodes. What is the right way to "extract" the trained LSTM and put it into a new, different graph that has a different style of input? The variable scoping features that TensorFlow has seem to address something like this, but the examples in the documentation all talk about using variable scope as a way of managing variable names, so that the same piece of code generates similar subgraphs within the same graph. The reuse feature sounds close to what I want, but I don't find the TensorFlow documentation on it at all clear about what it actually does. The cells themselves cannot be given a name (in other words,

 cellRaw = rnn_cell.MultiRNNCell([cellRaw] * NUM_LAYERS, name="multicell") 

is invalid), and while I can give a name to seq2seq.rnn_decoder(), I presumably couldn't remove the rnn_cell.DropoutWrapper() if I used that node unchanged.

Questions:

What is the right way to move trained LSTM weights from one graph to another?

Is it correct to say that starting a new session "frees up resources", but does not erase the graph built in memory?

It seems to me that the "reuse" feature allows TensorFlow to look outside the current variable scope for variables with the same name (existing in another scope) and use them in the current scope. Is that right? If so, what happens to all the graph edges from the non-current scope that attach to that variable? If not, why does TensorFlow throw an error if you try to use the same variable name in two different scopes? It seems perfectly reasonable to define two variables with identical names in two different scopes, e.g. conv1/sum1 and conv2/sum1.

In my code I'm working within a new scope, yet the graph won't run without data being fed into a placeholder from the original, default scope. Is the default scope always "in scope" for some reason?

If graph edges can span different scopes, and names in different scopes can't be shared unless they refer to the exact same node, then that would seem to defeat the purpose of using different scopes in the first place. What am I not understanding here?

Thanks!

1 answer

What is the right way to move the trained LSTM weights from one graph to another?

First you could create a decoding graph (with a saver object to save the parameters) and produce a GraphDef object that you can import into your bigger training graph:

    basegraph = tf.Graph()
    with basegraph.as_default():
        ***your graph***

    traingraph = tf.Graph()
    with traingraph.as_default():
        tf.import_graph_def(basegraph.as_graph_def())
        ***your training graph***

Make sure you load your variables when the session is started for the new graph.
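For that loading step, here is a minimal, hedged sketch using tf.train.Saver. It assumes the variables are actually defined in each graph (not only imported) and share the same names (e.g. both built under the same 'rnnlm' scope); 'my-model.ckpt' is a made-up checkpoint path.

    # Save the trained parameters from the training graph ...
    with traingraph.as_default():
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.initialize_all_variables())
            # ... run your training ops here ...
            saver.save(sess, 'my-model.ckpt')

    # ... then restore them into a session for the decoding graph.
    with basegraph.as_default():
        saver = tf.train.Saver()
        with tf.Session() as sess:
            saver.restore(sess, 'my-model.ckpt')
            # ... run your decoding ops here ...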

I have no experience with this functionality, so you may have to work on it a bit.

Is it right to say that starting a new session "frees up resources", but does not erase the graph built in memory?

Yep, the graph object still holds it.
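As a small illustration (not your code): closing a session leaves the default graph's ops in place; only an explicit reset clears them.

    # Closing a session does not erase the graph built in memory.
    x = tf.placeholder(tf.float32, name='x')
    with tf.Session() as sess:
        pass  # session opened and closed, nothing run
    print(len(tf.get_default_graph().get_operations()))  # still non-zero: 'x' is there
    tf.reset_default_graph()  # this is what actually discards the ops
    print(len(tf.get_default_graph().get_operations()))  # 0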

It seems to me that the "reuse" feature allows TensorFlow to look outside the current variable scope for variables with the same name (existing in another scope) and use them in the current scope. Is that right? If so, what happens to all the graph edges from the non-current scope that attach to that variable? If not, why does TensorFlow throw an error if you try to use the same variable name in two different scopes? It seems perfectly reasonable to define two variables with identical names in two different scopes, e.g. conv1/sum1 and conv2/sum1.

No, reuse determines the behaviour when you use get_variable with an existing name: when it is True, it returns the existing variable, otherwise it returns a new one. Normally TensorFlow should not raise an error here. Are you sure you are using tf.get_variable and not just tf.Variable?
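A small sketch of that behaviour, reusing the conv1/sum1 naming from your question:

    # Same variable name in two different scopes: no conflict, two variables.
    with tf.variable_scope('conv1'):
        a = tf.get_variable('sum1', [1])        # creates conv1/sum1
    with tf.variable_scope('conv2'):
        b = tf.get_variable('sum1', [1])        # creates conv2/sum1, no error

    # reuse=True makes get_variable return the existing variable instead of a new one.
    with tf.variable_scope('conv1', reuse=True):
        a_again = tf.get_variable('sum1', [1])
    print(a is a_again)                         # True: the same weights are shared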

In my code I'm working within a new scope, yet the graph won't run without data being fed into a placeholder from the original, default scope. Is the default scope always "in scope" for some reason?

I don't really understand what you mean here; placeholders don't always have to be used. If a placeholder isn't needed to run an op, you don't need to feed it.
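For example (an illustration, not your graph): only placeholders that the fetched op actually depends on have to be fed.

    p1 = tf.placeholder(tf.float32, shape=[])
    p2 = tf.placeholder(tf.float32, shape=[])   # not connected to y below
    y = p1 * 2.0

    with tf.Session() as sess:
        print(sess.run(y, feed_dict={p1: 3.0})) # fine: p2 never needs a value
        # sess.run(y) with no feed for p1 would raise the 'must feed a value
        # for placeholder tensor' error you are seeing.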

If graph edges can span different scopes, and names in different scopes can't be shared unless they refer to the exact same node, then that would seem to defeat the purpose of using different scopes in the first place. What am I not understanding here?

I think your understanding or use of scopes is off; see above.

