TensorFlow attention_decoder with RNNCell (state_is_tuple=True)

I want to build a seq2seq model using attention_decoder, with a MultiRNNCell of LSTMCells as the encoder. Since the TensorFlow code warns that the default behavior (state_is_tuple=False) will soon be deprecated, I set state_is_tuple=True for the encoder.

The problem is that when I pass the encoder state to attention_decoder, it raises an error:

*** AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape'

The problem seems to come from the attention() function in seq2seq.py and the _linear() function in rnn_cell.py, where the code calls get_shape() on the LSTMStateTuple object in the initial_state produced by the encoder.

Although the error disappears when I set state_is_tuple=False for the encoder, the program then emits the following warning:

WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.LSTMCell object at 0x11763dc50>: Using a concatenated state is slower and will soon be deprecated.  Use state_is_tuple=True.

I would really appreciate any pointers on building a seq2seq model with RNNCell and state_is_tuple=True.
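
For reference, a minimal sketch of my setup that triggers the error. The sizes and names are illustrative, and I am on the TF 0.x-era API (tf.nn.rnn, tf.nn.seq2seq.attention_decoder, and the old tf.concat(dim, values) argument order):

    import tensorflow as tf

    batch_size, num_steps, input_dim, num_units = 32, 10, 50, 128  # illustrative sizes

    # tf.nn.rnn expects a Python list of [batch, input_dim] tensors, one per step.
    encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_dim])
                      for _ in range(num_steps)]
    decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_dim])
                      for _ in range(num_steps)]

    cell = tf.nn.rnn_cell.LSTMCell(num_units, state_is_tuple=True)
    encoder_outputs, encoder_state = tf.nn.rnn(cell, encoder_inputs, dtype=tf.float32)

    # attention_states must be [batch, attn_length, attn_size].
    attention_states = tf.concat(1, [tf.reshape(o, [-1, 1, num_units])
                                     for o in encoder_outputs])

    # With state_is_tuple=True, encoder_state is an LSTMStateTuple, and this
    # call fails inside attention()/_linear() with:
    #   AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape'
    outputs, final_state = tf.nn.seq2seq.attention_decoder(
        decoder_inputs, encoder_state, attention_states, cell)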

1 answer

I also ran into this problem: the LSTM states need to be concatenated, otherwise _linear will complain. The shape of the LSTMStateTuple depends on the type of cell used. With an LSTM cell, you can concatenate the states like this:

    query = tf.concat(1, [state[0], state[1]])
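
Here state[0] is the cell state c and state[1] is the hidden state h of the LSTMStateTuple (equivalently, tf.concat(1, [state.c, state.h])); query then has a static shape of [batch, 2 * num_units], so the get_shape() call inside _linear succeeds.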

If you use MultiRNNCell, first concatenate the states within each layer, then join the layers (a fuller sketch follows below):

    concat_layers = [tf.concat(1, [c, h]) for c, h in state]
    query = tf.concat(1, concat_layers)
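
Putting it together, a minimal sketch of the multi-layer case (layer count and sizes are illustrative, same TF 0.x-era API):

    import tensorflow as tf

    batch_size, num_steps, input_dim = 32, 10, 50  # illustrative sizes
    num_units, num_layers = 128, 2

    encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_dim])
                      for _ in range(num_steps)]

    single_cell = tf.nn.rnn_cell.LSTMCell(num_units, state_is_tuple=True)
    cell = tf.nn.rnn_cell.MultiRNNCell([single_cell] * num_layers,
                                       state_is_tuple=True)

    _, state = tf.nn.rnn(cell, encoder_inputs, dtype=tf.float32)

    # state holds one LSTMStateTuple(c, h) per layer. Flatten each layer to
    # [batch, 2 * num_units], then join the layers into a single
    # [batch, num_layers * 2 * num_units] tensor that _linear can handle.
    concat_layers = [tf.concat(1, [c, h]) for c, h in state]
    query = tf.concat(1, concat_layers)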