I am trying to use the tf.contrib.seq2seq module to forecast some data (float32 vectors only), but all the examples I have found for the TensorFlow seq2seq module are about translation and therefore about embeddings.
I am trying to understand what exactly tf.contrib.seq2seq.Helper does in the seq2seq architecture and how I could use CustomHelper in my case.
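From the documentation, my understanding is that CustomHelper takes three callbacks (initialize_fn, sample_fn, next_inputs_fn), but I am not sure what they should look like for continuous outputs. This is only a rough sketch of what I imagine; the callback bodies are guesses for the float-vector case, not working code (the constants match the ones in the full code below):

import tensorflow as tf

batch_size, output_dim, output_seq_len = 8, 1, 20  # same values as in the code below

def initialize_fn():
    # called once before the first decoding step: returns (finished, first_inputs)
    finished = tf.tile([False], [batch_size])
    start_inputs = tf.zeros([batch_size, output_dim], dtype=tf.float32)  # a "GO" frame?
    return finished, start_inputs

def sample_fn(time, outputs, state):
    # with float outputs there is nothing to sample, so just pass the outputs through?
    # I am not sure whether CustomHelper accepts float sample_ids or whether
    # sample_ids_dtype / sample_ids_shape need to be set for this.
    return outputs

def next_inputs_fn(time, outputs, state, sample_ids):
    # feed the previous prediction back as the next input?
    finished = tf.tile(tf.expand_dims(time >= output_seq_len, 0), [batch_size])
    return finished, sample_ids, state

inference_helper = tf.contrib.seq2seq.CustomHelper(initialize_fn, sample_fn, next_inputs_fn)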
This is what I have so far:
import tensorflow as tf
from tensorflow.python.layers import core as layers_core

input_seq_len = 15                  # length of the input sequence
input_dim = 1                       # number of input features
output_seq_len = forecast_len = 20  # forecasting horizon
output_dim = 1                      # number of features to forecast
encoder_units = 200                 # number of units in each encoder cell
decoder_units = 200                 # number of units in each decoder cell
attention_units = 100
batch_size = 8

graph = tf.Graph()
with graph.as_default():
    learning_ = tf.placeholder(tf.float32)
    with tf.variable_scope('Seq2Seq'):
        enc_input = tf.placeholder(tf.float32, [None, input_seq_len, input_dim])
        target = tf.placeholder(tf.float32, [None, output_seq_len, output_dim])

        encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(encoder_units)
        initial_state = encoder_cell.zero_state(batch_size, dtype=tf.float32)
        encoder_outputs, encoder_state = tf.nn.dynamic_rnn(
            encoder_cell, enc_input, initial_state=initial_state)

        attention_mechanism_bahdanau = tf.contrib.seq2seq.BahdanauAttention(
            num_units=attention_units,  # depth of the query mechanism
            memory=encoder_outputs,     # hidden states to attend to (output of the RNN)
            normalize=False,            # normalize energy term
            name='BahdanauAttention')

        attention_mechanism_luong = tf.contrib.seq2seq.LuongAttention(
            num_units=encoder_units,
            memory=encoder_outputs,
            scale=False,
            name='LuongAttention')

        projection = layers_core.Dense(output_dim, use_bias=True, name="output_projection")
        helper = tf.contrib.seq2seq.TrainingHelper(
            target, sequence_length=[output_seq_len for _ in range(batch_size)])

        decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(decoder_units)
        attention_cell = tf.contrib.seq2seq.AttentionWrapper(
            cell=decoder_cell,
            attention_mechanism=attention_mechanism_luong,  # instance of AttentionMechanism
            attention_layer_size=attention_units,
            name="attention_wrapper")

        initial_state = attention_cell.zero_state(batch_size=batch_size, dtype=tf.float32)
        initial_state = initial_state.clone(cell_state=encoder_state)

        decoder = tf.contrib.seq2seq.BasicDecoder(
            attention_cell, initial_state=initial_state, helper=helper, output_layer=projection)
        outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder)

        loss = 0.5 * tf.reduce_sum(tf.square(outputs[0] - target), -1)
        loss = tf.reduce_mean(loss, 1)
        loss = tf.reduce_mean(loss)
        optimizer = tf.train.AdamOptimizer(learning_).minimize(loss)
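For reference, this is how I feed the graph during training (a minimal smoke test with random data; the values are meaningless and only check that the shapes are consistent):

import numpy as np

with graph.as_default():
    init_op = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)
    feed = {
        enc_input: np.random.rand(batch_size, input_seq_len, input_dim).astype(np.float32),
        target: np.random.rand(batch_size, output_seq_len, output_dim).astype(np.float32),
        learning_: 1e-3,
    }
    _, loss_val = sess.run([optimizer, loss], feed_dict=feed)
    print(loss_val)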
I realize that training and inference are significantly different for the seq2seq architecture, but I don't know how to use the Helpers from this module to handle that difference. I use this module because it is very convenient for attention layers. How can I use a Helper to build ['GO', [input_sequence]] as the input for the decoder?
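The only workaround I can think of is to build the decoder inputs by hand, prepending a zero "GO" frame to the targets shifted right by one step, and passing that to TrainingHelper instead of target; but I don't know whether this is what the Helper classes are supposed to do for me (this snippet would go inside the same variable scope as the code above):

# Manual construction of the decoder inputs: a zero vector as the "GO" frame,
# followed by the targets shifted right by one step (teacher forcing).
go_frame = tf.zeros([batch_size, 1, output_dim], dtype=tf.float32)
decoder_inputs = tf.concat([go_frame, target[:, :-1, :]], axis=1)  # [batch, output_seq_len, output_dim]

helper = tf.contrib.seq2seq.TrainingHelper(
    decoder_inputs, sequence_length=[output_seq_len] * batch_size)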